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PREFACE 

This report describes part of a comprehensive and continuing pro- 
gram of research concerned with advancing the state-of-the-art in 
remote sensitig of the environment from aircraft and satellites. The 
research is being carried out for NASA's Lyndon B. Johnson Space Center, 
Houston, Texa/i, by the Environmental Research Institute of Michigan 
(ERIM). The basic objective of this multidisciplinary program is to 
develop remote sensing as a practical tool to provide the planner and 
decision-maker with extensive information quickly and economically. 

Timely Information obtained by remote sensing can be important to 
such people as the farmer, the city planner, the conservationist, and 
others concerned with problems such as crop yield and disease, urban 
land studies and development, water pollution, and forest management. 

The scope of our program Includes: 

1. extending the understanding of basic processes. 

2. discovering new applications, developing advanced remote- 
sensing systems, and improving automatic data processing to 
extract information in a useful form. 

3. assisting in data collection, processing, analysis, and ground- 
truth verification. 

The research described herein was performed under NASA Contract 
NAS9-14988 and covers the period from May 15, 1976 through Nov 14, 1977. 
IrDale Browne (SF3) was the NASA Contract Technical Monitor. The 
program was difected by Richard R. Legault, Vice-President of ERIM and 
Head of the Infrared and Optics Division, Quentin A. Holmes, Head of 
the Information & Analysis Department, and Richard F. Nalepka, Principal 
Investigator and Head of the Multlspectral Analysis Section. 

The authors acknowledge the contributions of G, Thomas who was a 
co-innovator of the screening program and who assisted gfeatiy in the 

development of the overall structure and the initial demonstration of 
Procedure B, 
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W. Holsztynski contributed to the training selection procedure and 
provided mathematical Insight into the problems of ancillary variable 
depend^':Xi. signatures t 

Ri Hieber, A. Metzger, S. Lindner, and H. Gallarda assisted in 
many aspects of programming support and ground truth analysis* 

One of the central themes of the procedure we have developed is 
unsupervised clustering and post-cluster labeling and estimation* This 
approach, with numerous variations, has come to us from many sources 
within ERIM and JSC. 
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INTRODUCTIOK 

The purpose of Lhis work is to develop improved techniques for 
estimating the amount of an agricultural crop present over a large geo- 
graphical region or partition. Specifically, the problem is to define 
a training and classification procedure that will allow valid estimates 
for the region to be made on the basis of training information obtained 
from a few segments. Thus the data to be processed is of two types; 
a small amount of labeled training data from some segments in the region 
and a large amount of unlabeled data from these and other segments. 

Remotely sensed data has three main attributes: spectral, spatial 

and temporal. The spectral profile for each pixel is provided by the 
multispectral scanner. The spatial characteristics include the line 
and point number and the position of the segment in the region. The 
temporal characteristics include the changes associated with the passage 
of time during the growing season. 

The problem o± estimating and classifying in a wide region is com- 
plicated by several sources of variation in the data. These include: 

a. systematic external effects, such as haze, viewing angle, sun 
zenith angle, and scanner calibration which occur independently 
of which particular crop is in a particular pixel 

b. effects upon particular crops which depend upon ancillary 
variables such as moisture, growing degree days, crop calendar, 
etc.., which are observable for an entire segment 

c. random noise, due to scanner noise, to within-site variation 

of an ancillary variable whose site average is knowi, or finally 
to variation in underlying anelllafy conditions which are sig- 
nificant in their effects but not being eurrently observed. 

Regarding item e, the .noise properties exhibited by multispecttal 
scanner ita are generally highly spatially correlated . A simple example 

■ ■ - , ■ "1 




FO(JUeRLy WILLOW BUN LaBOBATORIES the UNIVtRSrrV OF MICHIGAN 


Is provided by the fact that the within field variance of the multi- 
spectral scanner signals is less than the between field variance of 
multiple examples of the same crop, hence the choice of a reasonable 
number of pixels from a single field within a segment may constitute 
insufficient training for that segment. Similarly the choice of a 
single segment and the fields within that segment as a training data 
base for a group of segments may constitute insufficient training for 
that group of segments. For this reason, our development of a training 
procedure has proceeded on the assumption that multiple segments are 
needed for training in order to represent the variability of data p res- 
ent In a group of segments. 

Some of the data variation can be removed by correcting for known 
external effects, so that ?tli data are transformed to a standard reference 
condition. In the procedure herein described, the data are screened so 
that pixels containing clouds, cloud shadows, bad data and water are 
detected and flagged, and a haze diagnostic is computed over the array 
of good data points. Next the data are corrected for differences in 
satellite calibration (i.e. , all Landsat 1 data Is modified to simulate 
Landsat II data), for sun zenith angle (i.e., all data is made to look 
as though it was gathered with a sun zenith angle of 39°), and haze 
(all data is transformed to a standard haze condltibn). 

The noise variation can he lessened and operating efficiency 
greatly increased by adopting methods of data compression that preserve 
useful information while averaging out noise. In our procedure we com- 
press the data in two ways, spectrally and spatially, 

^ a linear transformation of 
the four Landsat bands through a matrix rotation called the Tasselled 
Cap transform (or the Kauth-Thomas transfoim) [1]. Most of the signifi- 
cant Information regarding agricultural scenes has been found to lie in 
the plane defined by the first two components of the resulting transformed 
data; hence, these are retained, and the last two components are discarded, 
resulting In a data compression of a factor of 2. 
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Spatial averaging is accomplished by grouping together pixels which 
are near to each other and are spectrally similar. The spectral average 
of the groups of pixels » the number of pixels in the group and the aver- 
age spatial positions of the groups are retained as; data features. The 
groups of pixels are ca3-led blobs [4], As a result of blobbing, the 
data are further compressed by a factor of 30.. 

After compressed data features have been extracted, several possi- 
bilities exist for making the proportion estimates that estabLish the 
crop acreage. They range from classification of the individuai feature 
vectors to maximum likelihood estimation of the proportions to post- 
classification sampling (bias correction) . 

Whatever the method Of estiinatibn, it is necessary to select train- 
ing data that adequately represent the range of data variation in the 
regiori. For reasons of economy, the training should be restricted to 
as small a number of segments as possible. Our training procedure, 
designed to meet these objectives, is to cluster the compressed data 
features into spectrally siihilar strata, and then choose a subset of 
segments that best represents these strata. The training data within 
the segments are ehosen. at random subject to the criteria that they 
represent the strata proportionally and are allocated among the training 
segments as evenly as possible. 

The method of tralnihg suggests a simple method of proportion esti- 
mation that we include as part of Procedure B. The proportion of wheat 
in each stratum is estimated directly from the training data and then 
an average of these stratum proportions, weighted by the number of pixels 
in each stratum, is calculated to obtain the proportion estimate for the 
region. This method of propottibn estlmatiori is not a heeeSsary part of 
the procedure. The selection, of representative training data could be 
followed by other, perhaps better, proportion estimation methods. 

In siramiaty, the steps of our procedure and the SectiohS of the 
report describing them are: 

3 




1. external effects correction Section 2.1 

2. spectral data compression Sections 2.1 and 2.2.1 

3. spatial data compression Section 2.2.2 

4. selection of training data Section 3 

5. proportion estimation Section 4 

Dependence of the compressed data features upon ancillary data has 
also been investigated in the effort reported here (Section 6.1). This 
effort is in a preliminary state of Investigation. 

Extensions to the procedure to address the problem of missing 
acquisitions and the problem of predicting segments likely to be helpful 
in the next biophase are described in Section 6.2, 
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PREPROCESSING/ FEATURE EXTRACTION 

The objectives of the preprocessing and feature extraction steps 
may be several [2]. Their purpose may be to; 

a. Make the data more comprehensible by adjusting all of it 
to standard conditions of observation. 

b. Eliminate or flag bad or noisy observations in the data. 

c. Make the data more comprehensible by extracting physical 
features or projecting the data in such a way as to display 
its physical structure. 

d. Compress the data, retaining most of the information and 
averaging over noise and redundaey. 

e. Make the distributions of the derived features fit some 
convenient model such as the multivariate normal distribution. 

Several cautions need to be mentioned. First, the above objectives 
may not all be met by the same preprocessing transformations, in fact 
in some cases they may be mutually exclusive. Sometimes it may be 
desirable to carry out transformations on parallel paths ending at 
different objectives; for example, a linear transformation might be desir- 
able for producing projections which can be examined by a researcher 
in order to gain insight, whereas a non-linear projection might be used 
to cause the preprocessed data to fit a normal distrlDUtion. 

Further, the term ''preprocessing" may be somewhat misleading in that 
it appears to define a computer architecture. In which first the pre- 
processing steps are performed, then the classification steps, then a 
proportion estimate is made, etc. In fact, however, all of the different 
conceptual steps constitute merely one functional relationship between 
the data and the desired output, and could in praGtice be carried out in 
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one step. The preprocessing steps we are defining are conceptual and 
might be implemented in a variety of architectures. 

The organization we have actually used is as follows; first the data 
is screened and diagnostic characteristics are extracted from the data. 
Second, corrections are made to the data and spectral features are ex- 
tracted. Third, spatial features are extracted. 

Finally, the process of defining feature extraction algorithms is 
itself largely outside the field of pattern recognition. "In light of 
all the possibilities mentioned above, what feature extraction steps 
might help solve the problem at hand?" Such a formulation might be posed 
to a man, but not to a machine, at least not yet. The features extracted 
are based on a gestalt, on an overall conceptual basis of what the key 
problem elements are. Ke proceed now to outline the conceptual basis for 
the preprocessing and feature extraction steps described in this report. 

The Conceptual Basis 

In order to decide what features to extract, one first of all 
needs some fairly accurate global picture of the data structure- We 
will concentrate first on the spectral structure of the Landsat data 
from an agricultural scene, with some reference to the temporal struc- 
ture. 

The Tasselled Cap 

The Tasselled Cap [L],[3], is a visual-graphic presentation of 
the structure of Landsat data in agricultural scenes. The Landsat MSS 
gathers data in four somewhat overlapping spectral bands. Band 4 through 
Band 7. the data acquired by Landsat is however quite spectrally cor- 
related so that in fact almost all the data lies on a 2-dlmensionai 
plane in the four dimensional space. Further, the data is confined to 
a triangular (Tasselled Cap) shaped region of that plane . 


6 
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Figures 1 and 2 describe these ideas in more detail. Figure 1(a) 
shows an artistic conception of a scatter plot of Band 5 va. Band 6. 

Band 5 Is centered in the chlorophyll absorption band of green vegetation 
around 0.65 pm whereas Band 6 is centered on the cellulose reflectance 
peak of green vegetation around 0.75 ym. The signal from green vegetation 
is thus found to be small in Band 5 and large in Band 6. Such a point 
is indicated on Figure 1(a) by the designation "green stuff". 

The main variability found in soils is in their brightness. Hence, 
the signals from bare soils are distributed primarily along a line ra- 
diating from the origin. 

Band 4 is centered on the cellulose reflectance around 0.55 ym 
but extends significantly into the chlorophyll absorption region, so that 
Bands 4 and 5 are highly correlated as shown in Figure 1(b). 

Figure 2 expands the description to three dimensions. Soil sample 
points fall near a line and, further, fall predominantly in a planar 
region surrounding that line. Hants start out on bare soil and grow 
towards the region of green stuff. Among the variety of green plant 
canopies, some have large amounts of shadow, which shifts the observation 
directly toward the origin. Trees are an example of a green canopy 
with a fair amount of shadow, as shown by the "badge of trees". The 
variety of shadowing in various canopies creates a region called the 
green arm. 

When a particular plant canopy has reached its own appropriate 
stage of maximum green development it may then yellow. Often yellowing 
is accompanied by darkening, either in the vegetation itself or due to 
the drooping of its leaves which causes shadowing to increase. Either 
by yellowing, withering, or by being cut down, every canopy eventually 

returns to the soil from whence it came. 

In Landsat data the yellow pDlnt is so close to the "side" of the 
tasselled cap that it is almost indistinguishable. Hence, for most 
purposes we may say that the agricultural information is substantially 
contained in a plane defined by unit vectors in the "brightness" and 
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FIGURE 1(a). BANDS 5 AND 6 ARE FAIRLY UNCORRELATED 


Band 5 



Band ^ 


FIGURE Kb). BANDS 4 AND 5 ARE HIGHLY CORRELATED 
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"green stuff" directions, but with a small amount of variation in the 
"yellow stuff" direction perpendicular to the plane, 'Cv fourth direction 
"non-such" , orthogonal to the other three, contains primarily noise variation. 

Figure 3 shows some typical cluster plots of Landsat data front 5x6 
mile sample segments. In these figures the abscissa is "brightness" and 
the ordinate is "greenstuff". Notice that, in any particular scene, por- 
tionst of the Tasselled. Cap structure may be missihg, depending on the 
crops planted and their state of development and on the types of soil. 

(The numbers on the ellipses are only for identification.) 

Addlclpnal StruCLural Characteristics of the Data 

Signals from vegetation and soil form the Tasselled Cap. Water, 
clouds, cloud shadow, and haze also have their proper places in the four 
dimensions of Landsat signal space, and can be described relative to 
the position of the Tasselled Cap and the points already defined, as 
shown In Figure 4. 

Haze Correction 

Figure 4 shows that clouds are located out along the brightness 
axis, but shifted in the negative yellow direction as well. Haze can 
be thought of as intermediate or thin cloud. When haze is present 
over a scene the contrast of the scene itself is reduced while a portion 
of a cloud-like signal is added. The intermediate condition between 
no haze and completely hazy (i.e., cloud) is sketched in Figure 5 as 
a shaded region. We see that as the haze amount is increased the tri- 
angular shape of the tasselled cap shrinks towards the point of clouds. 

This demonstrates why haze has a severe effect upon signature ex- 
tension capability* The entire data structure shifts out along the 
brightness axis and shrinks in the green direction, but worse, the 
shift in the negative yellow direction is sufficient to move the 
Tasselled Cap sidewise completely off of its haze free image. 

If the data points are first projected onto the two-dimensional 
brightness-green plane the problem becomes less severe. Even then, 
however, there will be significant differences between hazy and haze 
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Iree signatures. The proper thing to do is to measure the amount of 
haze present and adjust the data back to some reference condition. 

For tuna teiy, for the data we have examined to date, the measurement 
of the amount of haze is possible by measuring the shift of the data in 
the negative yellow direction. 

In order to actually correct the data an atmospheric model which 
describes the effect of haze is needed. The entire procedure of making 
the haze measurement and applying the atmosphere model was developed 
by P. Lambeck of ERIM and is called the XSTAR algorithm. It is described 
in detail In Reference 5. It will be described in summary form in a later 
section in this report. 

The Effects Of Sun Zenith Angle 

To a first order of correction the radiance incident at the Land- 
sat detector system decreases directly with the cosine of the zenith 
angle of the sun at the earth point viewed. (This would be exactly so If 
the atmosphere haze scattering and the ground reflectance both behaved 
as Lambertian reflectors.) In order to make the data gathered under 
different sun zenith angles commensurate, it is all corrected to the 
standard zenith angle of 39“ which is typical of Landsat data over Kansas 
in April. This is done before haze correction so that the XSTAR algorithm 
can operate on a more standardized set of data. The form of the corrected 
data is 

cos 0 

1 G 

X = —5^ X 
cos 0 

where 

0 is the sun zenith angle and 6^ = 39% X is the data vector and 

■ . » 

X Is the data vector corrected fox sun angle i 

Correction For Satellite Galibratiort 

During the course pf this . work it was found useful to j.ncorporate 
data from both Landsat 1 and Landsat 2 passes, although Landsat 2 was 
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the primaty available satellite. The two satellite sensors have slightly 
different calibrations, which would hardly be noticeable except when 
attempting machine processing. Hence, it was necessary to find a trans- 
formation which would convert the Landsat 1 data to be commensurate with 
the Landsat 2 data, The procedure by which this was accomplished is 
detailed in Appendix I. Briefly, it consists of comparing pairs of 
Landsat 1 and Landsat 2 observations on the same sample segments on 
successive (9 day separated) passes. 

In order to actually carry out the fit it was necessary to take 
account of the differing haze levels and differing sun zenith angles for 
the two observations, and to, in effect, assume a nOn-^linear model for 
each of the satellites (although a useful linear relationship between 
the two was found) . 

2.1 SUMMARY OF PREPROCESSING STEPS 

We have now discussed most aspects of the preprocessing in a con- 
ceptual way. The actual corrections were Implemented in a two-step 
process. In the first step the raw data was "screened'* and haze diag- 
nostics were calculated. In the second step the haze correction and all 
other corrections were applied to the data. (In the same step spectral 
features were extracted, as will be described in Section 2.2.1.) 

2.1.1 SCREENING* 

The various data screening procedures are described in this section. 
During this process each acqiiisltion is screened and each pixel is 

flagged as to its status: 

Bad Data 

Cloud 


The screening parameters shown in this section are the ones that were 
actually used in the work described in this report. Subsequent^ 
improved version of the screening program, incorporating revised defin- 
itions of the decision surf aGes, has been developed by P. Lambeck and 
is documented in Reference 5. If this data were to be reprocessed 
today, the improved versibh would be used;^ The conci^tual basis of the 
screening program is the same in both versions. 

15 ■ 




Water 

Cloud Shadow 

Good Pixel, 

In the following, R’s are four component Tassellad Cap unit vectors 
and C's are scalars. It is assumed that the data has been converted to 
be conraiertsurate with Landsat 2. Table 1 lists the values for these co- 
efficients as well es all others used in this section. 

Bad Data 

If any of the following conditions are true, the pixel, x, is labeled 
bad data. 



= . 732 ^ + . 682 ^ > 


The Tasselled Cap transform used here was develpped by P> Lambeck 
for use on Landsat 2 data. The original transform. Reference 1, 
was,.developed'fbr,Landsat lvV"f, 
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def 

Z* = .73Z„ “ .68Z, > C 
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where 


_ def /c^ . 

1 ycos 6y ^ 


The above tests effectively enelose the Tasselled Gap, cloud, watet 
and cloud shadow. Most bad data points will be excluded by this test, 
since the volume enclosed is only a small fraction of the total volume 
of the Landsat signal space. 

Cloud 

If not bad data and if the following is true, the pixel is labeled 
cloud. 


Water 

if not bad data or cloud and if the following is true, the pixel 
is labeled water. 

h ^ H ^ ^W2 

Cloud Shadow 

If not any of the above and if the following is true, the pixel 
islabeled cloud shadow. 


good Pixel 

If none of the above, the pixel is labeled good. 

2.1.2 DIAGNOSTIC PROCEDURES (INTERIM) 

Diagnostic procedures consist of measuring the data average, the 
soil mean, and the green arm mean. 
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TABLE 1. GOEFFICTENTS FOR FEATURE EXTRACTION. LANDSAT 2 


Coeffielent 

Name Symbol Lands at 2 Band Value 

Reference solar 6 39° 

zenith angle o 


Brightnes.9 
unit vector 


Greenstuff 
unit vector 


Yellow stuff 
unit vector 


Non- such 
urilt vector 


Bad data 
thresholds 











4 

5 

6 
7 

4 

5 

6 
7 

4 

5 

6 
7 

4 

5 

6 
7 
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0.33231 

0.60316 

0.67581 

0.26278 

-0.28317 

-0.66006 

0.57735 

0.38833 

-0.89952 

0.42830 

0.07592 

-0.04080 

-0.01594 

0.13068 

-0.45187 

0.88232 

16 

-8 ' ' 

+8 

-32 

-24 

220 
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TABLE 1 . (Continued) 





Coefficient 

Name 

Symbol 

Landsat 2 Band 

Value (s) 

Cloud Threshold 

"c 


145 

Water Thresholds 

^W 1 , Sv2 


64 , -10 

Cloud Shadow 



10 

Threshold 

D 


Atmosphere effect 

“l 

4 

1.2680 


“2 

5 

1.0445 


“3 

6 

0.9142 


% 

7 

0.7734 


* 

X2 

4 

60.9 


5 

64.7 


A 

^3 

6 

75.7 


A 

^4 

7 

29.7 

Yellow shift 

Z* 


8.5 

reference 

j 



Data Average 

Measurement 




From the set of good pixels ealculate 
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where the are simple averages over these pixels. 

Soil Mean Measurement 

From the good pixels calculate 

) 

where 

Y = lower 5% level of histogram of Z_ 

^2 2 

Y = lower 5% level of histogram of Z 

M 3 



Green Arm Measuremeat 
W 

where 

20 
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Term 


and where 


^ = upper 5% level of Z 

A 

W = average of Z values included above W 


W- - average of Z„ values included above W, 
3 3 " 


= average of Z^ values Included above 


2.1,3 Correction. Procedures 

Correction procedures consist of two steps, sun zenith angle correc 
tion and haze correction. 

Sun zenith angle correction can be accomplished by utilizing the 
following relation; 


X 

where 

6^ is a reference angle, 
given acquisition. 

Haze correction, with the XSTAR algorithm uses the following 
relation: 

x" = Ax’ + B 


cos 6 
o 

cos 0 


and 0 is the solar zenith angle for the 


where A is a diagonal matrix and B is a vector. 

The formula for the diagonal elements of A is 






and for the elements of B it Is 


( ”V^ * 

B^ = U - e ^ 7 

where y is chosen to minimize the following function. 
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where 

^ are the components of the (yellow stuff) unit vector, and 

X = RW.'*’ 

In summary of the screening and diagnostic procedures, each pixel 
is corrected to the extent known a priori (that is, satellite cor- 
rected and cosine corrected but not haze corrected) and projected onto 
the Tasselled Cap coordinate system. In this coordinate systein certain 
standard screening procedures are Garried out. The pixels which are 
still labeled good after passing the screening tests are used to cal- 
culate diagnostic characteristics, primarily for the purpose of esti- 
mating the amount of haze present in the scene. 

The output of screening is the original raw data, untransformed 
and unmodified in value, but with an additional channel which is used 
to flag the data status (cloud, water, etc.). The diagnostics are trans- 
formed back so as to be compatible with the original raw data and become 
part of the data description which is carried with the data as header 
information. The diagnostics are certain average points in the data 
structure. If later an imp.voved satellite transformation is developed 
or an improved sun angle c rrection, these diagnostics will still be 
valid, since they are taken with respect to raw (untransformed) data. 

After screening, the hazie correction is calculated and then all of 
the corrections are applied, to the data in one step to avoid developing 
round off errors in the 8-bit data words. 

The green arm feature was actually used in the corrections used in this 
work. However, the data average feature y, appears to be equally use- 
ful, is easier to calculate, and is preferred by P. Lambeck. The lat- 
est version of the XSTAR algorithm is documented in Reference 5. The 
above expression Is approximated by a quadratic form for ease of cal- 
culation. 


formerly willow RUN'LABORATdRiES. THE UNIVERSITY Or MICHIGAN 



2.2 FEATURE EXTRACTION 

The two main feature extraction steps are spectral and spatial. 

The spatial feature extraction step, in current practice at ERIM, usually 
is carried out on registered multi temporal data. 

2.2.1 SPECTRAL FEATURE EXTRACTION 

For the work reported here the spectral features that have been 
extracted are the first two Tasselled Cap features, brightness and green 
applied after haze correction. For passes which are late in the season 
for wheat (4th biowindow) we have also retained the 3rd Tasselled Cap 
feature, yellow stuff, (This is a minor point in as much as we have 
concentrated on acreage estimates during the first 3 biowindows.) 

As was mentioned earlier, this step is combined with the data 
correction step above so as to minimize round— off errors. 

2.2.2 SPATIAL FEATURE EXTRACTION 

Landsat data over agricultural scenes has the characteristic that 
it tends to be homogeneous in fields. Each field may contain anywhere 
from a few to 100 or so pixels, so an opportunity exists to compress 
the Landsat data considerably, depending on the application. Further- 
more, in the process of compressing the data an opportunity exists to 
average over the system noise. 

How much can be gained in the way of system noise reduction depends 
primarily on how strong and recognizable the scene spatial structure is. 
If, for an extreme example, all fields were known to be 40 acres and 
arranged in a uniform checkerboard pattern, then this spatial structure 
could be exploited to average exactly over the pixels in each field. 

Very little of the total Information in the scene would be "used up" 
by the process of fitting the data to the required 1/4 ml x 1/4 mi grid. 

In the real world however, fields vary in size, and many are irreg- 
ular In shape. Hence, the process of defining fields by finding their 
boundaries depends upon very local information, e.g. , on the differences 
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(gradients) between, neighboring pixels* Much of the total information 
in the scene gets used up by such a process. 

When one examines an image of an agricultural scene In Landsat data 
one is impressed with the large huiriber of straight boundaries which are 
evident. It would be Interesting to attempt to explore these boundaries. 
Yet our opinion is that most of the useful information is contained simply 
in the concept of "nearby". Pixels which are near to each other are more 
likely to be the same class than pixels which are far away. Using this 
concept avoids the difficulty of explicitly defining boundaries. 

Our approach in spatial feature extraction is to exploit this 
nearness idea to the extent that we can in a natural and economical way. 

We want to do this by grouping together pixels which are spectrally and 
spatially similar. Further, if fields are small, so that very few large 
groups exist, we would like our procedure to default to the processing 
of single pixels, 

Ihe procedure we have adopted to carry out these functions is, 
briefly, as follows. 

1. To each pixel append additional channels which contain the 
spatial coordinates of the pixel* i.e., the line and point 
number of the pixel. 

2. Cluster the pixels in any treasonable cbnveritlphal way, using 
both lipectral and spatial values in computing the distance 
measure. 

3. Compute statistics for each cluster generated, including 

in particular the average spectral vector, the average spatial 
vector and the number of points included in the cluster. This 
information constitutes the compressed data. 

The computer program which carries out these pperations is called 
BLOB {3], [4] and the clusters produced by it, blobs. 
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For purposes of acquiring training data it is desirable to use 
only the pure part of blobs, that is, only the pixels which are not 
contaminated by the signal from adjacent blobs. For this purpose we 
have defined ’’stripped" blobs. Stripped blobs are those from which 
any pixels which were adjacent to any pixels of a neighboring blob have 
been deleted. The blob statistics are calculated only for the residue 
of undeleted pixels, and so are ’’pure*’ statistics. 

All of the steps through spectral feature extraction, above, are 
carried out on individual passes of Lahdsat. Blobbing, on the other 
hand is carried out on registered multltemporal data. (Some farther 
notes on blobbing will be found in Section 4.2 and Figures 7 and 8.) 
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PROCEDURE B TRAINING AND PROPORTION ESTIMATION 

We have discussed the first major element in any remote sensing in- 
ventory system; namely, preprocessing/f eature extraction. The next 
major question is, how are these features to be used to derive the re- 
quired outputs of the inventory? In our case, the acreage proportions 
of wheat are estimated for each of the available 5x6 sample segments. 
ConventiGnally, one proceeds by obtaining labeled samples, using the 
samples to estimate the feature density function for each of the classes 
to be estimated, and then making a proportion estimate using that estimat- 
ed feature density function and the rest of the available (unlabeled) 
samples. For example, one may classify the indlAd-dual samples and then 
aggregate the classification to obtain a proportion estimate. In the 
simplest version of Procedure B all of these steps are combined in a 
2-part procedure of selecting training and estimating proportions .. The 
procedure can be viewed conceptually from a variety of perspectives as 
will be indicated later. First, however, we will simply describe the 
baseline procedure as it stands . 

Table 2 outlines the two parts of the procedure, and these are 
described in more detail in the following sections. Referring to Table 2 , 
the first step in training seleetibn is to group together all of the 
blobs from the segments being processed, and perform an unsupervised 
Glustering of the blobs, based upon the spaetral components of the blob 
feature vectors. The result of this ciusterlng is to perform a spectral 
stratificatibn of the data. Every blob is assigned to some specific 
cluster. Next, a two-way table is made j SAgmant by cluster, num- 

ber of blobs in a partieular cluster in a parjticular segment is the 
entry in the table. Using this table, combinations of segments are 
chosen and evaluated V An attempt is made to find the combination of seg- 
ments which will contain blobs in more spectral clusters than any other 
combination of segments- 

fmtm 
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TABLE 2. PROCEDURE B OUTLINE 

Section 

2 . Image Preprocessing/Feature Definition 

3. Procedure B Training Selection 

- Select segments and fields (blobs) within those segments 

3.1 Segment Selection 

" Cluster the blobs in all segments, creating spectral 
strata 

- Select segments so as to contain blobs which cover all 
the spectral strata 

3.2 Field Selection 

- Randomly select blobs from those which can potentially 
be used to cover all the spectral strata 

4. Proportion Estimation 

- Identify the selected fields, estimate the proportion of 
wheat in each spectral stratum, and aggregate the total strata 
proportions 

4.1 Identify selected blobs 

4.2 Estimate the proportion of wheat in each spectral stratum 

4.3 Make a proportion estimate for each segment and for the 

entire group of segments . 

Having selected training segments it is then necessary to select, 
as training samples, individual blobs Within those segments- The total 
number of blobs to be selected is first spread out among each of the 
spectral strata (since it is desirable to have more training from larger 
spectral strata). The allocated number of training blobs for each spec- 
tral stratum are further allocated to the selected training segments. 
Within each training segment/spectral-stratum combination the training 
blobs are drawn randomly from those available. 
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3.1 SEGMENT SELECTION 

Several auxiliary concepts will help in describing the segment 
selection routine. 

Number Of Segments To Select 

The number of training segments to be selected is a parameter of 
the procedure, say, M. M can range from one to the total number of 
segments. Typically we have been working with 4-6 training segments. 

Pairwise Search Pattern 

The optimal way to choose the set of M training segments is to 
evaluate all combinations of the K segments available for training, 

M at a time. The computer time required to accomplish this is un- 
reasonably large. At the other extreme the segments could be selected 
stepwise one at a time, in a manner similar to a stepwise regression 
procedure. Thus, one first picks the best single segment. Then given 
that segment, one picks the next segment to give the highest value of 
the criteribn function, etc. This procedure usually would give 
reasonable results at minimal cost. A compromise which is guaranteed 
to give better results but is still quite efficient is tp pick the 
best pair of segments, then the best Gomplementary pair and so on 
until M segments have been selected. The program actually developed 
allows the user to specify the number of segments to be selected at 
each sequential step. 

Criterion Function 

Initially we can think of an evaluation or criterion function 
which Increments by one the first time that a spectral stratum becomes 
represented in the process of choosing segments. Thus, if there ate 
50 spectral strata, the maximum value the criterion function might 
achieve is 50. However, additional considerations have been built into 
the criterion function. 
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Lower Bound 

If a segment only contained a single blob which represented a 
certain spectral stratum, it might be considered an accident. It 
might turn out on subsequent examination that the single blob was ambig- 
uous in its definition or too small to be clearly defined. Thus, we 
might not want to claim the spectral stratum had been represented if 
only one (or indeed, only a few) blobs were chosen, Hence, the "lower 
bound" is a parameter of the selection program, and is under user 
control. We have generally operated the program with the lower bound 
equal to 2, which means that 2 blobs roust be present in order to con- 
sider a spectral stratum to be represented. 

Extra Weight 

Even if a fairly large number of blobs is available to represent 
a particular spectral stratum from one training segment, one still 
might feel it is desirable to obtain some additional representation 
from another training segment. In particular, a spectral stratum 
might consist of both wheat and nonT-wheat, with wheat dominating some 
segments and no n-wheat dominating others, and picking more than one 
segment would improve the chances of representing both cases. In order 
to accommodate this desire the criterion function allows the user to 
set weights on the first, second, third, etc., occurrence of tepresenta- 
tion of a spectral stratum by a selected segment. Most usually we 
have been assigning successive weights of 100, 20, 4, 0, to successive 
representations , 

3.2 TRAINING FIELD (BLOB) SELEeXION 

la the process of selecting segments, knowledge of the number of 
blobs in each potential training segraent/spectral stratum combination 
is required. The result of the segment selection is to choose a cer- 
tain number of blobs for training from each selected training segment/ 
spectral stratum combination. Within that eorabination the required 
number of training blobs are then chosen at random. 
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Randomlzatioii Of Blob Selection 

The random selection order of blobs is actually introduced earlier, 
prior to the spectral clustering step. At that point all the blobs 
chosen to be clustered, from all the segments, are placed in a random 
order, so that no particular sequencing over segments becomes associ- 
ated with the clusters which are produced. Prior to randomization the 
segment number is associated as a tag with each blob. After the spec- 
tral stratification is complete, the blobs must all be sorted by seg- 
ment and spectral strata in order to select training segments. However, 
this sorting out is independent of blob number so that a random order 
is still maintained within each segment/ spectral cluster combination. 
Therefore, blobs are merely chosen sequentially from this randomly 
ordered list. 

Big Blobs vs. Little Blobs 

Earlier, when we described the process of blob feature extraction, 
it was mentioned that, for training purposes, it is desirable to use 
only ''pure" pixels. Pixels which have one or more neighbors belonging 
to a neighboring blob are therefore tripped", i.e., labeled as a blob 
boundary pixel. Only those blobs which have oite or more non-boundary 
pixels are used for training purposes, and the blob statistics (spectral 
mean, vector) are computed only over the non-boundary pixels. 

At this point it is well to mention that the spectral stratifica- 
tion is established by clustering only the big blobs. The small blobs 
are classified into these strata and are used in proportion estimation, 
but do not enter into the training selection. 
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PROPORTION ESTIMATION 

As was mentioned earlier, there are a number of reasonable methods 
of estimating proportions once training fields have been established. 

Here we describe the one currently used with Procedure B. The basic 
approach Is to identify the selected blobs, use these Identifications 
to make a proportion estimate for each spectral stratum, and then extend 
these estimates over all of the segments. 

4.1 IDENTIFY SELECTED BLOBS 

The identifications to date have been based on ground truth maps 
provided for r^GlE Phase II Blind Sites in Kansas. In an operational 
version of Procedure B, one would have analysts provide blob labels. 
However, in a research version there are two main reasons for using 
ground truth labels. First of all, it is desirable to separate errors 
created by the procedure under development from errors which might be 
introduced from mistaken labels . Secondly , it is desirable to have 
exhaustive labels (for all the blobs in the sites) available, for a 
reason which follows, and the only source of exhaustive labels is the 
blind site ground truth. 

The reason for having exhaustive labels available is in order to 
examine a variety of parameter settings and to introduce random repli- 
cates of the procedure for testing and evaluation. In an operational 
environment the analyst would be called upon to identify a specific set 
of blobs for a single iteration of the entire procedure. Even though 
this Is qxiite a bit of work, a large training gain could still be achieved 
relative to training and classifying every segment . But in the research 
environment it is less total effort to identify all of the blobs once 
and for all, prior to running the entire procedure many times ^ 

An additional reason for ejdiaustive labeling is to be able to 
systematically locate any sources of error In the development of the 
procedure. 
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T&chnlques of Loeatina Identifying BIqIjs front Ground Truth 

Several approaches have been used in the process of establishing 
the identity of blobs from the ground truth. The current standard is 
described as follows. 

Available Information 

The ground tiruth identifications are made available to us in the 
form of a dark and light contact print (ammonia process) of the original 
ground truth aircraft photos. Normally these are acquired from an 
altitude such that 4 photos are needed to cover a 5x6-mile segment. 

/e 

Each photo is at a slightly different scale and a slightly different 
camera angle. Analysts at JSC have previously established and drawn 
the sample segment boundaries on an overlay o'" the photos and the 
fields labeled as wheat have also been outlined, fields are labeled 
as wheat or as other major crops on the overlay. 

The liSS data is processed by us into a variety of forms and pre- 
seated on a line printer output at 8 lines per inch (producing an image 
at approximately 1:24000 scale). The variety of forms includes gray- 
maps, blob maps and stripped blob maps. 

The next step is to transfer the ground truth wheat field bounda- 
ries accurately from the separate photos onto an overlay to the line 
printer maps. Having done that, each big blob which intersects a wheat 
field is identified and the propor<-,.on of it which is wheat is estimated 
If for any reason there is doubt as to the identity, a special symbol 
meaning "ambiguous" is used. These estimates are entered into a list 
which is stored and can be searched by computers. Any big blob hot on 
the list is automatically considered to be non-wheat. Subsequently 
for any particular run of the segment and field selection procedures 
the identity of the selected fields is obtained automatically from the 
computer and the proportion estimates are made automatically. 

The several types of line printer maps are used at various stages 
of the transferring and estimation process. The graymaps are used 
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primarily to find the positions of the mile roads in the image. Two 
overlays with a prepared mile road spacings are used and ad-justed to 
the best fit position by eye. The intersections of the mile roads are 
then recorded and used as reference points in the ground truth photos. 
Figure 6 shows a graymap of this type with the mile roads drawn on. 

The particular graymap algorithm used for these maps is chosen to 
have a reasonable fidelity to the photo tones, but especially seems to 
heighten the contrast along the road network. The mapped quantity is, 


201 

- "5^^ o 

, if Y^ >1 

G , 3 

G 

180 


, o therwise 


whera 

Y_ „ is the green feature in the third hlowindow (i.e.^ the 2nd 

^ n M 

component of R X , where X is the corrected Lands at 
counts vector) 

Y is the brightness feature in the third biowindow. 

The quantity ra is mapped such that the lightest areas on the map 
correspond to the largest values of m. With this rule, green fields 
appear as dark to medium areas on the map. Fields which are not green 
enough to be interesting are mapped in respect to brightness, and appear 
from medium to light areas on the map. 

Next the photo copies are mosaiced together (approximately, leaving 
an overlap at the boundaries) and the mile road grid Is drawn onto eacb 
of the separate quarters of the mosaic. A reference intersection near 
the center of the 5x6-mile s^ple segment is clearly marked on both the 
photo copies and on the graymap overlay, so that any intersection can 

be easily located on both media. 

★ 

The basic technique of using mile road grid Was developed by S. 

Lindner. 

'Two overlays are required since the angle between N-S roads and 

E“W roads as seen in Landsat data is a function of the latitude. 
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From this point on several different procedures have been used, 
and we will describe two of these. 

In the first procedure, the field boundaries and prominent scene 
features are transferred freehand to the graymap overlay. This is accom- 
plished using the mile road grid as a guide. This procedure is relatively 
quick and is accurate to within a few pixels which is sufficient for areas 
of large fields. The overlay is then placed on top of the stripped blob 
map and those blobs which intersect a wheat field are labeled. 

The blobs themselves are numbered in the process of producing them, 
and this number is coded into a symbol for the blob which is used In 
making the blob map. Two maps are used, one containing alphanumeric 
symbols corresponding to the numbers 1...50, and the other containing 
the symbols 0, 0, 1, X, 0 corresponding to 0, 50, 100, 150,..., 950. 

Together these cover the range 1 to 1000 and if there are blobs -with 
numbers larger then 1000 they are easily identified by the portion of 
the map in which they fall. Figures 7 and 8 show a pair of blob maps 
of the types just mentioned. 

The second general approach is to digitize the mile road inter- 
sections as control points and then digitize the vertices of all of 
the wheat fields on the photo copies. The digitized points can then 
be transformed to line printer coordinates through a computer program 
and an overlay can be drawn using an automatic plotter. The plot can 
be overlayed onto the blob maps as before and the blobs labeled. This 
method is, in principle at least, somewhat more accurate than the free- 
hand sketching method, but it is also more time consuming. In either 
case the task of determining the Identity of individual blobs is still 
the same. 

One potential advantage of the digitizing method is that a com- 
puter program may be written to automatically determine the label for 
each blob from the digitized Information. This would be especially 
valuable if different blob maps were to be made of the same region for 
testing purposes , 
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lu order to maintain perspectii/e we note again that this discussion 
all applies to the reseax'ch environment. The operational environment 
for the use of blob and Procedure B in general is quite different and 
in many ways less complicated. We will discuss the possible operational 
implications in Section 6. 

4.2 ESTIMATE THE PROPORTION OF WHEAT IN EACH SPECTRAL STRATUM 

The method of estimating the proportion of wheat in each spectral 
stratum is to divide the total miciber of wheat blobs among the sampled 
blobs by the total number of wheat plus other blobs. '‘Ambiguous" blobs 
are excluded from either numerator or denominator. (If a blob is 0.80 
wheat then 0.8 is added to the numerator and 1.0 is added to the denomi- 
nator.) 

In symbols , 


A. = 7] b /N. 

1 4_j i j X 


where 


is the estimated proportion wheat for the i^^ spectral stratum 

b - is the estimated proportion o± wheat in the j blob among those 

Lll 

chosen for training the i spectral stratum 
is the number of blobs in the training set foi^ the i*"^ 

spectral stratum. Note that this set can extend over several 
segments . 

4.3 ESTIMATE SEGMENT PROPORTION 

The estimated proportion for each segment is calculated from the 
proportions of the spectral stratum of which it is comprised, as follows: 


P = E V. n, /L n. 

S A*- X ±sl*—‘ XS 
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where 

/S. 

is the estimated proportion wheat in segment S 

n^g is the number of pixels contained in blobs of segment S which 
belong to spectral stratum i. 

We have now completed the description of the simplest version of 
Procedure B. In a later section we will discuss modifications and im- 
provements to this basic system. In the next section. Section 5, we 
discuss results for the base line procedure carried out on 17 Blind 
Test Sites in Kansas. 
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RESULTS WITH BASELINE PROCEDURE B 

An Initial demonstration of Procedure B was carried out using manual 
procedures at all steps after the spectral strata clustering. In 
particular, the segment selection was carried out by lining up markers 
on the rows and columns of a large sheet of paper displaying the array 
of blobs contained in each spectral- stratum/segment combination. The 
required nimber of blobs for each combination were established roughly 
proportional to the square root of the total number of blobs in each 
spectral stratum, and roughly proportional to the number available for 
training in that stratun from each training segment. The required niimber 
of blobs were identified from a stripped blob map such as Figures 6 and 
7, but without the assistance of a good greymap. The blob proportions, 
and the spectral stratum proportions were tabulated manually. The seg- 
ment proportions were tabulated by a program which has since been 
superseded . 

The initial manual operation was demonstrated on 17 sample segments, 
using six of them for training. The segments are those *75- *76 blind 
sites in Kansas for which acquisitions were available in or near the 
first three biowindows. The procedure as initially implemented did not 
Incorporate ancillary data, either in a partitioning step or as a part 
of the classification methodology. 

The performance of Procedure B on the 17 s^ple segments is summa- 
rized in Table 3, and in Figure 9v In the table the training segments 
are marked with an asterisk. The' .overall performance is Indicated by 

the bias figure, 2.16%, and the coefficient of variation, which is the 

'W ■ 

standard deviation of the error in percentage estimate, 9.12%, divided 
by the true percentage 20.93%, for a c.v. of 0.44, 

The standard error of the overall bias estimate is 9.12/v^ 2.21. 

Because the overall bias Is less than this, it is not statistically 
significant. 

' I 

|>^|i ILAMK iKyp iritmi] 
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TABLE 3. PERFORMANCE OF PROCEDURE B ON 17 SAMPLE SEGMENTS 


Segment 

Estimated 
Percentage, E 

True 

Percentage, T 

E ^ T 

1020 

J 

23. G 

25.29 

- 2.29 

*1035 / 

19.65 

17.53 

2.12 

*1041 

11.1 

14.39 

- 3.29 

*1154 

28.8 

34.8 

- 6.00 

1163 

16.0 

8.72 

7.28 

1165 

29.95 

6,51 

23.44 

*1167 

l1,1 

8.03 

9.67 

1180 

36.0 

26.36 

9.64 

1851 

24.5 

25.0 

- 1.00 

*1852 

26.7 

22.29 

4.41 

1860 

18,6 

24.8 

- 6.20 

*1861 

41.9 

34.39 

7.51 

1865 

15.2 

20.4 

- 5.20 

1883 

28.2 

14.4 

13 . 80 

1886 

14,55 

28.85 

-14.30 

1887 

12.8 

10.9 

1.90 

1891 

28,4 

33.18 

- 4.78 

Average 

23.12 

20.93 

2.16 

a 

8.54 

9.32 

9.12 


C. V. = 0,44 


If we have trained sufficiently, the performance of the six train- 
ing sites ought to be Indistinguishable from the performance of the 
11 recognition sites. Table 4 shows summary performance for these cases 
The performance for the training segments has slightly higher bias and 
somewhat smaller standatd deviation. 


Estimate 
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Key: • Training Segment 

* Mean OF Estimate versus Mean of True 


FIGURE 9. ESTIMATED PROPORTION VS. TRUE PROPORTION FOR 
17 BLIND SITES IN KANSAS 
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TABLE 4. PERFOBMANGE SUMMARY FOR TRAINING AND RECOGNITION 
SEGMENTS CONSIDERED SEPARATELY 


Average True Percentage 

Average Estimated 
Percentage 

Bias 

Standard Deviation 

Standard Deviation 
of the Mean 

Coefficient of Variation 


6 Training 

11 Recognition 

Sites . 

Sites 


21.91 20.40 


24.31 

22.43 

2.40 

2.03 

6.10 

10.69 

2,49 

3.22 

0.28 

0.52 


In both cases the bias is within the error of estimate of the mean 
that one would expect from these few samples, so there is no significant 
bias for the two cases . Neither is there a significant difference between 
the biases. 


With regard to the standard deviations, an F test of the ratio of 
the two variances shows that they are not significantly different at the 
5% level. 

The coefficient of correlation between the estimated and true 
proportions for all sites Is 0.48, which is significantly different from 
0 at the 5% level. 

These results can be Interpreted as a moderate success for Procedure 
B, one which suggests that variations on the procedure ought to be 
systematically explored. Therefore, many aspects of the procedure have 
been automated and many runs have been made with varied parameters. So 
far none of those runs has duplicated the success obtained on the manual 
exercise. 
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Figures 10 and 11 are examples obtained for two different cases. 
Figure 10 is a three-pass case with 17 sample segments and five training 
segments. Figure 11 consists of ten carefully chosen segments (i.e., 
closely matching acquisition dates) and three training segments. 

At first it was thought that a particular training segment which 
was missing in Figure 10, compared to Figure 9, was the cause of the 
trouble. However, it has been found that the results do not appear to 
be particularly sensitive to the Ghoice of individual segments. Subse- 
quently we have checked many aspects cf the automated procedure and the 
identification of ground truth. Errors nave been found but none which 
constitute an explanation. 

A remarkable consistency of pattern is present among all the 
results. Certain segments whose ground truth proportion is low have 
very high estimates. Certain segments whose ground truth proportion is 
high have very low estimates. The same segments display the same hias 
over a variety of parameter settings. Thus Figure 11 looks like a 
random sample drawn from Figure 10, 

If we coull throw out these few bad points the overall correlation 
looks quite reaf>onable, but there is no rational basis for throwing out 
samples in the absence of ground truth. [Appendix IV, added in proof, 
gives an explanation for these bad 'points and shows how they can be 
brought into line by the use of an ancillary variable.] 

There remain two differences, which have not yet been checked, 
between the manual exercise and the subsequent automatie ones. The 
manual exercise was conducted on blobs generated using four acquisitions, 
whereas the automatic exercise has, been conducted on three-acquisition 
blobs. Secondly, there were many potential sources for error in identi- 
fication of the ground truth during the manual exercise, and so only 
blobs which were obviously identifiable were used for training. In many 
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TRUE PERCENTASE 


FIGURE 10. ESTIMATES VS. TRUE FOR 17 SEGMENTS AND 6 TRtUNING SEGMENTS, 

69 SPECTRAL STRATA 



TRUE PERCE 


FIGUKE 11. ESTIMATES VS. TRUE FOR A S 
CLOSELY MATCHING BIOWINDOWS 
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cases this meant that only blobs with a fairly large number of pixels* 
even after stripping, were used. At the present time these last two 
sources of explanation are being studied. 


yERIM 


FOWMERLY willow run LARORATORIFS. Tfie UNIVERSITY OF MiCHli^AN 


6 

EXTENSION TO THE BASELINE PROCEDURE B 

In the foregoing we have described the simplest version of Proce- 
dure B as of the date of this report. Two major topic areas have oc- 
cupied our interest in the way of extensions to this basic procedure, 
namely ancillary data dependence and operational system capabilities. 

6.1 ANCILLARY DATA DEPENDENCE OF SIGNATURES 

We outlined, in Section 1, the sources of variability in crop 
signatures. These Included: external physical effects which can be 
accounted for if the parameters driving them can be measured (as in 
the cases of the satellite type, the sun zenith angle and the amount 
of haze) , also included are random sources of variation which can be 
averaged over (as in the case of the training selection portions of 
Procedure B); and finally, systematic effects which are dependent upon 
ancillary variables which are measureable, but the magnitude and form 
of the dependencies are unknown initially. It is these last effects 
which we wish to discuss at this point. 

Candidate Ancillary Variables 

Ancillary variables fall into several main categories, as shown 
in Table 5. 

The site specific ancillary variables describe permanent features 
of a segment. Pass specific variables describe conditions which can 
vary from pass to pass and among these we distinguish those that can 
be measured from the satellite image data itself, and those that can 
be measured from other sources such as meteorological satellites or 
an ephemeris. 

Conceptual Model For The Use Of Ancillary D ata 

Two general ways have been proposed for handling the dependency 
of signatures upon ancillary data, partitioning and signature modeling. 
First, a few words about partitioning. 
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TABLE 5. GROUPING OF ANCILLARY VARIABLES 
SITE SPECIFIC VARIABLES 

— Average growing season temperature C<legree days) 

- Average gro'/ing season rainfall (inches) 

- Latitude/ longitude/ground elevation 

- Land use category 

- Soil classification 

PASS SPECIFIC VARIABLES (EXTERNAL) 

- Sun zenith angle/scanner viewing angle 

- Crop calendar 

- Accumulated degree days 

- Accumulated rainfall 
“ Satellite 

PASS SPECIFIC VARIABLES (INTERNAL, DIAGNOSTIC) 

- Soil mean 

- Green arm mean 
~ Haze diagnostic 

- Percentiles of green development 


The general Idea of partitioning is that one finds an area or 
domain of the partitioning variables over which signatures are essen- 
tially constant. Within this doniain training and proportion estimation 
takes place. 

An assumption Is that such clear strata exist. There are cases 
where strata are divided by obvious boundary lines, but more generally 
there is a smooth continuum of partloning variables, and the actual pos 
tion of partition boundaries is somewhat arbitrary. 
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Secondly, in order to do an adequate job of partitioning one would 
like to include all of the possibly significant ancillary variables 
into the definition of a partitioning rule. As the number of variables 
is Increased the size of partitions tends to grow smaller and smaller, 
even though there also is much correlation among the partitioning 
variables, either in their occutrences , in their effects, or in 
both. The tendency of partitions to get smaJler can be overcome by 
taking account of these correlations, but this in effect requires a 
signature model. 

In Appendix HI we develop a viewpoint which encompasses both an- 
cillary variable dependence and partitioning. Partitioning is explained 
in terms of signature models, and this is intended to put a rational 
framework behind partitioning. 

We think of the ancillary data as conditioning the likelihood 
functions for the various crops which may be present in the scene. 

Then we Imagine that classification or proportion estimation is 
carried out in a conventional way using these conditional likelihoods. 

A partition is a special case in which the values of some of the 
ancillary variables are bounded and an average is taken over the bounded 
range. 

Results With Ancillary Variables 

We have attempted to incorporate ancillary variable dependence into 
Procedure B, but have not achieved acceptable performance to date. 

We believe the primary reason has been that we have not taken 
proper account of the dimensionality of the ancillary data space, 
whether this is expressed by saying that we haven't yet selected the 
"correct" ancillary variable function, or by saying that the ancillary 
variables are highly correlated in their effects. Appendix III. 3 exam- 
ines this question of dependency in a mathematical sense. The important 
conclusion is that the intrinsic dimensionality of the ancillary data 
space is never more than the number of spectral signature parameters 
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which must be estimated. However, if this is simply the mean bright- 
ness and green for three passes for a single crop (i.e., wheat) that 
is 6 degrees of freedom. Thus, a minimal number of sites (i.e., dif- 
ferent ancillary variable conditions) which might be used to estimate 
these 6 parameters is about twice as many or 12 of them. 

The ancillary variable problem, the problem of signature dependence 
and the problem of signature extension are all fundamentally the same 
problem. 

We have analyzed the signature extension problem as having three 
fundamental components, (external effects, random effects and ancillary 
variable dependent effects) each of which can be attacked by a separate 
and distinct methodology. Over the next two or three years the relative 
importance of these various components will become quantified. The 
relative balance will be different for different survey problems. It 
will probably be found that winter wheat is one of the easiest problems 
in regard to ancillary variable dependence and that in order to attack 
more difficult problems a systematic crop signature model will be a 
necessity. This in turn is going to require: 

a. an ongoing, centrally organized, comprehensive, well controlled 
field measurements program to obtain information on the depend- 
ence of crop signature on ancillary variables 

b. a comprehensive modeling effort to relate the crop signatures 
and ancillary variables 

c. a large scale ground truth effort to obtain empirical signa- 
tures as a function of a wide diversltjr of random and sys- 
tematic effects 

d. an empirical operational model or experience organizer, i.e., 
a systematic methodology for converting ground truth and 
analyst Interpretations into a computer signature model. 
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6.2. OPERATIONAL SYSTEM CAPABILITIES 

In order for any system to become operational there are certain 
capabilities it must incorporate. Perhaps the most important Is the 
ability to make timely estimates in the presence of missing or partial 
data. A desirable, although non-essential capability is to choose train- 
ing efficiently in a time sequential manner. As of the writing of this 
report both of these capabilities are in a developmental stage. The 
next two subsections explain these developments In more detail. 

6,2.1. THE PROBLEM OF MISSING ACQUISITIONS 

Suppose data has been acquired for the first three biowindows for 
some segments; for other segments, however, data has been acquired only 
for various combinations of two of the three biowindows, vdiile for 
still others data is available for only one of the three biowindows. 

In its simplest form Procedure B could be applied to each of the 
sets of segments separately. However we tfouid expect a greater net 
training gain and less potential for confusion in an operational system 
if Procedure B could be extended to operate on all the segments even 
with missing acquisitions. A method of accomplishing this has been 
devised and is described as follows. 

a. During the process of clustering of blobs to produce spectral 
strata the blobs which have a full complement of acquisitions are 
clustered first. After the full clusters have been estab- 
lished the clustering continues using the blobs from segments 
with missing acquisitions. 

b. A blob containing partial acquisitions is assigned to a full 
acquisition spectral stratum based just on the available 
spectral coordinates. For example; 
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If three passes 

Three-pass 


are available; 
mean, 



Three-pass covariance, 




Distance measure Is 

(X - (X - P) 


If only passes 1 and 2 are available, ignore the third component 
of the pixel vector and strike out the third row and third column 

of L 


c. If the partial acquisition blob is not near enough to any of 
the existing spectral clusters, a new cluster is defined 
involving only the partial acquisitions. 

d. After the clustering is complete the segment selection proce- 
dure Is exercised as before, the main difference now being 
that full complement blobs are counted as belonging both to 
the full complement spectral strata, plus all subsets of them, 
(see Table 6 ) Thus, segments which have a full set of acqui- 
sitions are represented in a large number of strata and are 
thus more likely to be chosen for training. 

e. With these modifications the estimation procedure Is carried 
out just as before. 


V 


t 


i 


I.. 1. . .. :. ..l . ... . ■ J . 
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TABLE 6. EXAMPLE BEPKESENTATION OF SUB-STRATA. THE ENTRIES 
IN THE TABLE ARE THE NUMBER OF BLOBS REPRESENTED 
BY THE SEGlffiNT /SPECTRAL STRATA COMBINATION 

1 

1. 2, 3 5 

1. 2. - 5 

1, 3 5 

2. 3 5 

1, - 5 

2, - 5 

3 5 


6.2.2 THE SEQUENTIAL SELECTION PROBLEM 

In an operational system it will be desirable for various reasons 
to use the same segments for training on successive acquisitions, to 
the extent possible. It may be that the identifications of training 
fields from an earlier biowindow may still be valid at a later time and 
so analyst effort can be saved. There may be some benefit from the 
analyst developing familiarity with a segment so that the segment is 
easier for him to work the second time around. The segment selection 
program could be modified to include weights on the individual segments 
so that, other things being equal in the choice between two segments, 
the one which had previously been selected for training would be selected 
again. 

A further possibility exists. Could not the program be modified so 
that, other things being equal, the segment would be chosen which had 
the highest probability of being available on a subsequent (l.e., future) 
acquistion? In order for this to be a worthwhile capability to imple- 
ment, there must be a means of estimating the probability of future 
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acquisition. One means depends on whether the segment is in the overlap 
zone between successive day passes of the satellite. If it is, the pair 
probability of at least one acquisition is nearly double the single day 
probability. This factor could be further modified by the position of 
the window relative to the 18-day overpass cycle of the satellite. The 
second factor is the quality of the multitemporal registration reference 
pass. This affects the probability of being able to register the data 
multi tempo rally for a given degree of cloud cover. These factors are 
subject to determination or estimate, so that a table of acquisition 
probabilities for each segment for each biowindow can be produced and 
updated each time the Procedure B segment selection program is to be run. 
(Note that the entries into the probability table are either 0 or 1 for 
past events, i.e., the acquisition either failed or was successful.) 

The future event entries in this table can vary by as much as a factor 
of two for a short biowindow. 

Now the value of selectiag a segment for training in the current 
acquisition is related to: 

a. whether it was acquired or not in a previous bxowlndow or 
combinations of them. 

b. whether it was used for training previously, 

c. what the probability of acquisition is for future biowindows 
or combinations of them. 

In order to simplify the problem somewhat we will assume that the segments 
have a training value which is proportional to the number of acquisition 
combinations covered by the segment, namely 2 ” - 1, where n is the 
number of acquisitions. Thus, the expected value of a segment is equal 
to the probability of being acquired n times (by end of season) times 
2*^ - 1, summed over the values of n. And the probability of 
being acquired n times is a sum over the different ways of acquiring it. 
Thus, for a case where a maximum of three biowindows are being considered 
the expected value of a segment is 
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EV = (23 - IHP 1 P 2 P 3 ) 

+ (2^ - 1)(P1P2[1 - P 3 ] + Pl[l - P 2 I P3 + [1 - Pi] P2 P 3 ) 

+ (2^ - l)(pi[l - P 2 IU - P 3 I + [1 - Pll P2[l - P 3 ] 

+ [1 - Pll [1 - P 2 ] Ps) 

where pj^, P 2 * and pg are the entries in the probability table. If for 
example, this is the second acquisition we might have pj^ = 0 , p 2 = 1 and 
P 3 = 0.25 giving 

EV = 0 + 0.75 + 0.75 
= 1.5 

Given two segments having equal actual training value as determined 
by the currently available acquisitions we will want to decide between 
them based on their expected value. Since we don't know in advance the 
trade off between training efficiency and EV we will have to define a 
relative weight between these factors and adjust it empirically, A 
convenient form is 


Total value = (TV) * (1 + k* (EV)) 


where 


TV is training value 
EV is expected value 
k is a relative weight factor 

This function is being incorporated into a revised version of the 


segment selection program. 
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CONCLUSIONS AND RECOMMENDATIONS 

Thus far the results achieved, i.e, , proportion estimation per- 
formance, using Procedure B are inconclusive. However, in the inunediate 
future we will continue to search out and correct possible sources of 
error in the procedure. 

Certain statements ought to be made irrespective of the outcome of 
that evaluation. 

The preprocessing/ feature extraction structure we have set up appears 
to be stabilized. The structure is transportable in the sense that the 
ideas are simple and cau, and have been, implemented by other researchers. 
There will, no doubt, be further improvements in the correction algorithms 
and these will be incorporated as they become available. But we believe 
that the present implementation forms a solid working structure. 

We have characterized the signature extension problem in terms of 
three main sources of variation in signatures; 

1. systematic external effects applicable to all crops 

2. systematic ancillary variable effects on Individual crops 

3. apparently random variation between segments which appear 
identical in terms of the ancillary parameters being observed. 

The value of this taxonomy of the problem is that three distinct 
methodologies are available to attack the three aspects of the problem, 

i.e. , 

1. physical models combined with regression using unlabeled samples 
over a variety of conditions of measurement 

2. signature studies on individual crops, using both canopy models, 
field measurements and empirical regression against the ancil- 
lary variables 

3. multisegraent training and statistical analysis of sampling 
strategies for training sufficiency and efficiency. 
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Evaluating progress in these areas in the SR&T community, we can say 
that much progress has been made in the first area. Specific improve- 
ments that are still needed and possible are discussed in [5]. In 
the second area, a comprehensive and sustained approach to the ancillary 
variable dependent signature problem is needed, as outlined in Sec- 
tion 6.1, 

In the third area, we regard Procedure B as a prototype framework 
in which questions of a statistical nature concerning signatures, sig- 
nature variability and sampling can be asked, and, we believe, answered. 
Procedure B appears to be soundly based, statistically, as a stratified 
sampling scheme. It shares with other signature extension methods a 
tendency for estimates to correlate well with each other, but less well 
with the ground truth. Thus a study of the sources of error in Proce- 
dure B will not only be expected to make the procedure work acceptably,, 
but also to strike at tbe heart of the signature extension problem. 

If there are hb significant errors in the application of the proce- 
dure, then what we have found is a bona fide example of attempting signa- 
ture extension over too large a partition (i.e., the entire state of 
Kansas). This would force the use of smaller partitions or the incorpo-r 
ration of ancillary data into the signature model in order to achieve 
acceptable performance. Appendix IV, added in proof, explores this 
possibility and presents some further results. 
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APPENDIX I 

A TRANSFORMATION TO MAKE LANDSAT 1 MSS DATA COMPATIBLE 
TO LANDSAT 2 MSS DATA 

(R. Kauth, J. Heradal) 


1. INTRODUCTION AND DATA BASE 

It was desired to estimate coeffieients of a transformation which 
would convert Landsat 1 data to Landsat 2 data. In order to make 
the estimate it is desirable to use the identical scene observed 
under identical conditions by both satellites. The nearest we can 
come in practice is to observe pairs of data over the same scenes 
separated, by nine days. An initial selection of about 25 such pairs 
was made» but natural attrition reduced this ultimately to eight 
pairs. As will be seen this is a quite minimal set and the fitting 
procedure should be repeated with a larger sample. 

Pairs were eliminated from consideration if either member contained 
"patchiness" of cloud or haae, as evidenced in the histogram output 
of program SCREEN, or if the haze Levels of the pair were thought to 
be markedly different from each other as evidenced by the yellow 
level of the mean of soils calculated by program SCREEN. The relative 
yellow level for each pass was judged against all non "patchy” passes 
for the same satellite by plotting a separate yellow level histogram 
over those passes for each satellite. The finally accepted pass 
pairs are listed in Table 1. Of the eight cases, four are eases in 
which the Landsat 2 pass precede the Landsat 1 pass. 


The data used for fitting was the four band "mean of soils" and the 
four band "mean of the green arm", both outputs of program SCREEN. 
These data and their average are shoxvn separately for each Landsat 
band in Table 2. 


2. MODELS 

Several different models were used for fitting. In general the models 
are of the form 
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TABLE 1. 

PASS PAIRS 

USED IN 

LANDSAT 1 : 

LAtlDSAT 2 PITTING 

PROCEDUPvE 





Sun Zenith 

Sun Zenith 

Case # 

Segment # 

LI Date L2 Date 

Angle, 0^ 

Angle, 0^ 

1 

;1020 

155 

164 

37° 

32° 

2 

1020 

101 

92 

47° 

46° 

3 

1154 

153 

162 

37° 

31° 

4 

; 1855 

82 

73 

52° 

52° 

5 

1855 

82* 

91 

52° 

46° 

6 

1857 

101 

92 

46° 

44° 

7 

1861 

101 

92 

46° 

45° 

8 

1882 

153 

162 

37° 

31° 


* Same pass used twice 
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TABLE 2. DIAGNOSTIC DATA OS ED IN FITTING 


Green Ai'm 



Soil 

Mean, YT 

Mean, 

WT 

Average 

Case 

LI 

L2 

LI 

L2 

LI 

L2 

1 

41,5 

41.7 

25.1 

24.4 

33,3 

33.05 

2 

40.2 

36.8 

28,0 

22.0 

34.1 

29.4 

3 

36.1 

36.3 

26.0 

23.6 

31.05 

29.95 

4 

34.2 

30.3 

26.9 

22,1 

30.55 

26.2 

5 

34.2 

36.3 

26.9 

26.3 

30.55 

31.3 

6 

39.7 

37.6 

28. 7 

25.0 

34.2 

31.3 

7 

43. 6 

41.2 

35.0 

32.0 

39.3 

36.6 

8 

34.9 

35.1 

25.6 

24.6 

30.25 

29-85 

1 

40,3 

47.1 

16.9 

25.4 

30.1 

36,25 

2 

39.6 

41.7 

24.3 

23.3 

31,95 

32.50 

3 

31, 3 

38.0 

16.6 

20.3 

23.95 

29.15 

4 

33.5 

34.3 

25.4 

25.8 

29.45 

30.05 

5 

33.5 

41.9 

25.4 

29.7 

29.45 

35.80 

6 

40.7 

44.8 

25.2 

28.8 

32.95 

36.80 

7 

46.3 

49.7 

36.8 

40.7 

41.55 

45.20 

8 

29.2 

36.4 

16.2 

20.6 

22. 70 

28.50 
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TABLE 2. DIAGNOSTIC DATA USED IN FITTING (Continued) 


Band 


6 


7 


Green Arm 



Soil 

Mean, YT 

Mean, 

WT 

Average 

Case 

LI 

L2 

LI 

L2 

Ll 

L2 

1 

40,5 

49.5 

4,15 

50.5 

33.30 

50.0 

2 

38.8 

41.7 

40.7 

40.5 

39.75 

41.1 

3 

32.8 

40.5 

46.3 

54.6 

39.55 

47.55 

4 

32.3 

34.3 

32.5 

33.9 

32.4 

34.1 

5 

32.3 

42.5 

32.5 

45.8 

32.4 

44 . 15 

6 

38.8 

45.1 

44.3 

46.1 

41.55 

45.6 

7 

44.5 

50.0 

40.6 

44.3 

42.55 

47.15 

8 

34.3 

41.0 

43.5 

59.6 

38.9 

50.30 

1 

17.1 

20.5 

22.0 

25.3 

19.55 

22.90 

2 

17.0 

17.5 

21.6 

20.1 

19.30 

18.80 

3 

14.0 

16.1 

25.8 

26.7 

19.9 

21 .40 

4 

14.5 

15.2 

16.5 

16.6 

15.5 

15.9 

5 

14.5 

18.0 

16,5 

22.3 

15.5 

20.15 

6 

16.9 

18.9 

24.2 

22.7 

20.55 

20.80 

7 

19.8 

20.9 

19.1 

19.3 

19.45 

20.1 

8 

14.8 

16.6 

23.1 

28.6 

18.95 

22.6 
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^2 = F(.ii, 0^, e^, X^) + 


where 

is the Landsat 2 signal 
is the Landsat 1 signal 

and the sun zenith angle of Landsat 1 and Landsat 2 

cases, respectively, 

a are a set of parameters which must be estimated 
S is an error 

Three specific models were tried, as follows: 

a. Ratio model assumes for each band, 
cos 0^ 

V 


x„ = A ^ 

I cos 0, 


i.e., that there is no difference in signal offset between the 
two satellites, and that the radiance returned from the scene 
is an inverse function of the cosine of the sun zenith angle. 


b. Offset model, assumes 


/ = A / \ 

\cos \cos 0^/ 


+ B 


c. Three parameter offset model, assumes 
cos 0 


^2 ^ 2^1 


... i- U O r) cos 8_ 

rr “ A„A " ^ B 

cos 0. 1 2 21 cos 0, 1 


where 


^1 responsivity of Landsat 1 

is the responsivity of Landsat 2 
B^ is the offset for Landsat 1 
B^ is the offset for Landsat 2 


thus 


cos 0^ cos 6„ 
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Model (a.) requires a fit to one parameter per band, (b.) requires two 
parameters per band, and (c.) requires three parameters per band. The 
residual error per band after fitting is shown in Table 3. The Model (c.) 
is considerably the best fit for band 7 and slightly better for the other 
bands. 


TABLE 3. RMS ERROR OF THREE MODELS 


Band 

Model a. 

Model b. 

Model c. 

4 

1.11 

1.05 

0.81 

5 

2.06 

1.60 

1.36 

6 

2.99 

2.24 

1.66 

7 

4.78 

3.39 

1.20 


In general we can write 

z = a + a,x + a„y 
o 1 i 


z = X 2 

cos 0 2 

V — '■ ■■ ■! 

cos 6 1 

COB 02 

y ■ “'i 

In order to minimize the noise of the observations of the soil and green 
arm points we used their averages from Table 2. Thtis, is the average 
LI in Table 2. Using these tabulated values, and regressing z upon x 
and y, gives the results shown in Table 4. 

Most times we will be interested in converting Landsat 1 data to appear 
as though it were Landsat 2 data at the same solar zenith angle. There- 
fore the model will simplify to the form 


and identify 
a = B„ 


a^-A 


“2 ' ■'"'l ■ 




V 


r 


I 


I 



x-2 ~ ^ 

where B = - AB^^ 

B is also given in Table 4, 

TABLE 4. EIEGRESSION COEFFICIENTS FOR THREE PARAMETER MODEL C 


Band 

a = B- 
o 2 

= A 

= -AB^ 


n 

4 

-19.48 

1.04 

13.69 

-13.16 

-5.79 

5 

-24.99 

l.GO 

26.18 

-26.18 

1.19 

6 

-74.70 

1.09 

71.79 

-65.86 

-2.91 

7 

-21.53 

0.82 

24.54 

-29.93 

3.01 
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APPENDIX II 
PROCEDURE B DATA FLOW 


Figure 1 is a schematic of Procedure B data flow and programs. 
Not all of the listed programs have been described in the main text in 
this report since portions of Che procedure are still under development. 
Following is a brief description of the purpose of each program. 


CONVERT: 

Transforms dat’ from universal to ERIM format. 

SCREEN: 

Flags pixels foL clouds /shadow, water, bad data; 
and computes diagnostics. 

MERGE: 

Combines data from separate passes and merges 
flags . 

CORRECT: 

Carries out pixel by pixel correction for 
satellite, sun zenith angle and haze, and 
tasselled cap feature extraction. 

BLOB AND STRIP: 

Carries out multitemporal-spatial feature 
extraction. Outputs a blob feature tape and 
a pixel tape. 

AGOV; 

Computes a metric for spectral clustering 
program (BCLUST). 

ROSSTP ; 

Sorts blobs into big and small categories. 
Prepares a big blob file to be randomized. 

I7RAND: 

Randomizes order of blobs over an entire 
partition or stated collection of segments. 

BCLUST : 

Clusters blobs, producing a spectral strati- 
fication of them extending across segments. 

BLIST; 

Produces working files for following programs. 

WGT: 

Segment and blob training selection program. 

PRGRP ; 

Calculates estimated proportion for each 
spectral stratum. 

PROP : 

Calculates estimate proportion for each segment. 

GRAF: 

Produces graphs. (Courtesy of The University 
of Michigan’s Aeronautical Engineering 
Department) 


I 














of these programs most are not of detailed interest to anyone 
outside of ERIM. The current versions of the screening and correction 
programs are documented in reference 5. Blob and BCLUST are documented 
from a users point of view in this appendix. The segment selection 
program, WCT, is documented in this appendix by means of an annotated 
Fortran listing. The linking and utility programs, PRORP, PROP and 
AEROjGRAF are actively being modified and would be quite dependent upon 
the particular computing installation, so they are not documented here. 
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APPENDIX 11,1 

USER DOCUMENTATION OF BLOB 
{W. Richardson) 


ROUTINE: 

BLOB 




VERSION; 

3.0 




DATE: 

25 March 1977 




PROGRAMMER: 

W. Richardson 




COMPUTER; 

7094 




LANGUAGE: 

MAD 




INTERFACE; 

A POINT Module 




PURPOSE: 

To group pixels into '’blobs’*, that is, clusters 
spectrally homogeneous and spatially contiguous 

that are 
or nearly so. 

COMMENTS : 

BLOB 3.0 is a rewrite of BLOB 

2.0, 

which was programmed by 


A. Pentland and was based on an algorithm devised by R. Kauth 
with the assistance of G. Thomas and A, Pentland ^3], BLOB 3,0 
runs 7 times as fast as the previous version. 


DESCRIPTION: 


The basic idea of BLOB 3.0 is the same as that of the previous version, 
namely, that the data is clustered (by an algorithm similar to A. Pentland 's 
CLUSTR) using spectral channels and two additional channels containing the 
line and point number. Each pixel is processed as follows : 

1. Find the current blob, say Blob j, that the pixel is closest to. 

2. If the distance from the pixel to Blob 3 is less than a param- 
eter T, include the pixel in Blob j and update the mean of 
Blob j. 

3. But if the distance is greater than or equal to x, start a new 
blob containing one point (the pixel being processed) whose 
mean is the channel values of the pixel and the line and point 
number. 
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4. The line and point coordinates are rotated to line them up with 
north-south and east-west, and a distance measure used whose 
spatial component allows for the presence of rectangular fields. 
The spectral component uses means and variance but no covariances 

The major differences between the new version of BLOB and the previous 
one are as follows. 

1. A bug was removed from the screening test, a test based on 
line and point coordinates to quickly remove from consideration 
many of the current blobs. Also, the new version eliminates 
line screening, which is not effective. 

2. The present version does not have a subset of channels feature, 
but Instead relies on the subset option in POINT or the module 
SUBCHN. This means that it is no longer possible to blob on 

a subset of channels and at the same time pass through all the 
data channels. This option could be put in again without much 
difficulty if needed. 

3. Whereas the previous version updated variances, the present 
version does not, except that line and point variances can be 
increased by updating if the control variable BIGGER IB. 

4. The present version puts out blob means on cards, as did the 
previous version. However, the most useful form of blob mean 
output is a tape of blob means in raultispectral format put out 
by STRIP 3.1. 

5. The spatial component of the distance measure in the old 
version was 

■. jii - , (P - p)^ 

V (var (var p)^ 

This measure was an attempt to encourage rectangular, rather than 
ellipsoidal fields. An equidistant contour is a rectangular- 
looking super-ellipse like a TV screen. 

The new version uses 

(I ~ 
var 2, 


■(p r_.£l^ 

* var p 


max 
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which is faster and has an equidistant contour than is a true 
rectangle, 

6. The new version uses a method of indexing blobs that keeps on 
the active list only those blobs that axe being used. The 
method is to set a variable SKIP (default 2) , and when SKIP 
lines go by without a blob being chosen by a pixel, the blob 
is considered completed and retired from the active list. Time 
is saved by considering only active blobs, and blobs that are 
long in the column direction are not cut off arbitrarily 
before they are completed. 


HOW TO USE: 

BLOB is a POINT module with control input in Step 1. A typical deck set- 
up Is like this; 

ID Card 

$EXEGUTE, DUMP, COMPILE MADE 
POINT. (BLOB.) 

E'M 
BLOB deck 
$DATA 

Blob control input 

PROCESS input specifying tapes and rectangle or file to be processed 
Next rectangle or file 

M0DE1=$START$* 

Blob control Input with some control values changed 
Rectangle to be processed, etc. 

The BLOB control input is in READ AND PRINT DATA format as follows : 


Variable 

Default 

Description 

NCHAN 

4 

Number of spectral channels used in 
the blobbing. 

VAR 

10. 

A vector of fixed variances of the 
spectral data values used as divisors 
in the distance function. Example: 



VAR(l) = 36., 9., 36., 9., 36., 9., 
36. , 9. , 9. 
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Variable 

Default 

Description 



VARL 

10. 

Variance of 

the 

line value. 

VARP 

10. 

Variance of 

the 

point value 



BIGGER 


TATJ 


MAXCEL 


OB 


9 . 


200 


When BIGGER = IB, line and point 
variances are updated, hut only if 
the updating produces a bigger 
variance than before. 


When the distance 


NCHAN 

L 


1 



VAR^ 


+ max 


(A - 
VARL 


(P - P)^ ‘ 
VARP 


from the pixel . . . , 

1, p to the nearest blob mean 
■ * ‘ » PnCHAN ’ » P is less than TAU, 

the pixel is assigned to that blob 
and the mean updated* Otherwise the 
pixel becomes the first member of a 
new blob. A later paragraph will 
discuss setting TAU, VAR, VARL and 
VARP. 


The maximum number of blobs on the 
active list. It does not make BLOB 
run any faster make MAXCEL small, 
because only those blobs being used 
are retained on the active list. If 
MAXCEL is too small, some pixels will 
not be blobbed because no more entries 
are permitted on the active list. 

(A message will be printed in that 
case . ) At the end of the run , the 
maximum number of blobs on the active 
list at one time will be printed. 

The best plan is to make MAXCEL as 
big as possible without running 
out of storage. 


■•■i ■••• - • I 


• ... .. ..... , ■; 
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Variable 

PRINT 

PUNCH 

CUT 

WHERE 


Default 

IB 

OB 

0 

5 


EXTRAGHAN 0 


SKIP 2 


BADCH 0 


OLDBAD OB 


Descrip t: ion 

A list of blob means is printed (if 
PRINT = IB) and/or punched (if 
PUNCH = IB) for all blobs except 
those with fewer than CUT pixels. 
When WHERE 4 5, the punched card 
images are written on tape IfflERE, 


The blob number is stored in two 
channels on the output tape, the 
first containing the integer value 
of the blob number/512 and the second 
containing the remainder, Tf 
EXTRACHAN (known as RACHAN in the 
program) is 0, the two channels are 
put out after the input channels in 
in the output data vector. But if 
EXTRACHAN is set =7, for example, 
then the blob number is stored in 
channels 6 and 7. 


If SKIP lines go by without a given 
blob acquiring a pixel, then the 
blob is considered completed. It 
is printed and/or punched and retired 
from the active list. 

When BADCH = 0, you rely on the QCAMS 
convention that a word QBAD in the 
title block designates a channel to 
flag bad data, but when QBAD is 0, 
then no flagging of bad data is done. 
If the BLOB user sets BADCH to some 
number > 0, then that chanhel is 
designated as the bad data flag 
overriding the QCAMS convention . 

When OLDBAD = IB, you assume the old 
QCAMS convention that QBAD > 0 and 
DATUM (QBAD) ^ 0 means that the pixel 
is bad data. In that case, the blob 
number is set to 0 and no furtheir 
processing is done on that pixel. 
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Variable Default Description 

When OLDBAD stays at its default 
value OB, then it is assumed that the 
flag channel was produced by SCREEN 
and ADDFLG. Then the decision rule 
is that 

1. if the only flagging on any pass 
is for water, the pixel is passed 
on to further processing 

2. if any pass is flagged cloud, 
the pixel is assigned to Blob 1 
without further processing 

3. otherwise the pixel is assigned 
to Blob 0 without further pro- 
cessing. 


A problem in using BLOB is the setting of the BLOB parameters 
VAR(l) . . . VAR(NCHAIT) , VARL, VARP and TAU. These values are relative; 
if all are multiplied by 2, say, the decision rule is unchanged. The 
size of TAU controls the number of blobs. If TAU is too small, a great 
many small blobs are created. If TAU is too large, then extraneous 
data is glued together in large blobs. The relative size of VARLj VARP 
and VAR determines the relative weight given to spatial, as opposed to 
spectral, homogeneity within blobs. In other words, values of VARL and 
VARP that are too large produce blobs that are like spectral clusters, 
scattering themseleves all over the map, picking up pixels that are spec- 
trally similar- if VARL and VARP are too small, then neighboring pixels 
will be grouped together with little regard for spectral similarity. 

The blobs will be nicely contiguous, but they won’t in general correspond 
to a pattern of fields characterized by spectral homogeneity. 

In BLOB runs on Kansas blind site data, R. Kauth and T determined blob 
parameters as follows. The channels chosen for blobbing were the 
Tasselled Cap channels brightness and green for four blophases and the 
yellow channel for the fourth biophase -- nine channels In all. The 
values of VAR(l) . , .VAR(9) were set by Kauth at 36., 9., 36., 9., 36., 9., 36 
9., 9. The basis of this guesstimate was a belief that a residual error 
variation arisitig from such causes as changes in planting density and 
atmospheric differences not completely corrected by XSTAR4 would produce 
a standard deviation of at least 3 units in each channel. More variation 
would be expected in the brightness direction because of the effect view 






I 


X 
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angle has on brightness and because of differences in soil brightness. 

So the brightness standard deviation was guessed to be 6. The important 
guess is that brightness has twice the standard deviation of the other 
channels; the absolute values chosen are immaterial. 

We next assumed that VARP = VARL. So that left two undetermined param- 
eters, VARL and TAU. We then chose one of the blind sites (1G41) and 
made a number of BLOB runs with different values of VARL and TAU. The 
first set of runs showed us the sensitive ranges to be explored with 
finer resolution in the second set of runs. We made a stripped blob 
map of each run and chose the parameter set that produced the best-looking 
map as judged b^ size of blobs, absence of large areas where the blobs 
were stripped away to nothing, and a discernible pattern of roads cutting 
up the landscape into fields. 

The values we ended up with were VARL = 6. and TAD = 30. This setting 
produced 573 blobs for Sj^te 1041 (117 lines, 196 points) and a maximum 
of 55 blobs on the active list. The same parameter setting was used 
for 40 pass combinations from 28 sites. The number of blobs varied from 
297 to 1537 and the maximum number on the active list from 44 tc 130. 
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APPENDIX II. 2 

USER DOCUIIENTATION OF STRIP 
(W. Richards ou) 


ROUTINE; STRIP. 

VERSION; 3.1 


DATE; March 1977 

PROGRAMMER: W. Richardson 


LANGUAGE: MAD 

SYSTEM; 7094/UMES 


INTERFACE; A POINT module to be used after BLOB 3.0 




PURPOSE: To distinguish boundary pixels of blobs (i.e., strip the 

blobs) and to put out a tape of blob means. 


COMMENTS: The part of the program that distinguishes boundary pixels 

closely follows version 2.0 of STR.IP written by J. Gleason 
and A. Peiitland with exceptions noted in the "Description" 
section of this memo. The part of the program that puts 
out the mean tape is new. 


SUBROUTINES 

NEEDED: Subroutines QAK. and SWAP are needed in addition to library 

routines. 


DESCRIPTION: 

STRIP puts out two tapes: the pixel tape and the mean tape. The pixel 

tape is the one put out by the POINT system, except for the last line, 
which is put out outside the system. It contains all the input channels 
plus an -additional channel (the strip channel) which is 0 if the pixel 
adjoins, horizontally or vertically, a pixel from another blob, and is 
1 otherwise. By mapping the l*s from this channel, we get a picture of 
the blobs with all the boundary pixels stripped away. If STRIP and MAPP 
are linked together in one call to POINT, the last line will he missing. 
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The mean tape contains the blob means and associated information. It is 
in muitispeetral format with an arbitrary line length of 120 pixels, 
each pixel representing a blob. The last line is brought to standard 
length by filling the unused spaces with zeroes. The first NDATA. chan- 
nels of each pixel are the blob mean vector computed from the Interior 
pixels (i.e., those not identified as boundary pixels) if there are 
any; if not, the mean is computed from all the pixels in the blob. The 
additional channels are 


Channel Description 

NDATA+1 Line mean . 


NDATA+2 


Point mean. 


NDATA+3 

NDATA-f4 


Blob number in two channels because it may be > 511. 
Channel NDATA+3 is the quotient after dividing the 
blob number by 512 and channel NDATA+4 is the remainder. 


NDATA+5 

NDATA+6 


Number of pixels in the blob put in two channels as 
with blob number. I have not observed an example 
where channel NDATA+5 is greater than zero. 


NDATA+7 


NDATA+8 

NDATA+9 


Number of interior pixels in the blob. If this number 
should happen to be greater than 511, it would be set 
equal to 511 when put out on tape. 

Segment number and try number. The try number refers 
to the set of passes being processed. The segment 
number (e.g., 1035) and try number (e.g., 1) are put 
together into one number (e.g., 10351) and then put 
into two channels as with blob number. 


NDATA+10 (Optional) Segment index number. 


In addition to the generation of a mean output tape, the differences 
between versions 3.1 and 2.0 of STRIP include 

1. A bug causing incorrect results for the first line is removed. 

2. Bad pixels, indentified by a blob number of 0, get a 0 in the 
strip channel but ad.jaeent pixels are not stripped. Version 2.0 
had a line-numbering problem in this case. 

3. A provision is made to put out the last line on tape. This 
featui-e works only for tape output; if the output is a DATUM 
vector to a subsequent module, the last line will be incorrect. 
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4. The scheme for indexing active blobs is the same as for BLOB 3.0; 
therefore, STRIP 3.1 should be used on BLOB 3,0 output. 


HOW TO USE: 


STRIP is a POINT module with control input in Step 1. As has been said 
before, it is used correctly only on the output of BLOB 3.0. The STRIP 
pixel output is correct for the last line only if the output goes directly 
on tape rather than to a subsequent module. On both output tapes, one 
field is produced for each rectangle or file of data processed. Don't 
forget to include subroutines QAK and SWAP in the job deck. The control 
input is in READ AND PRINT DATA format as follows: 


Variable Default 

NDATA 0 


TAB 


MB IN 
MUNIT 


IB 


} 


Description 

The number of spectral channels of 
data coming in, not including flags, 
blob number, etc. When NDATA is left 
at its default value 0, the number of 
spectral channels is set to QNDAT, a 
word in the title record. 

When TAB = IB, a mean output tape 
is printed. Wlien TAB ~ OB, the cal- 
culation and printing of the means 
is bypassed. 

When TAB = IB, MB IN must be set to 
the bin number of a tape or to 
$REWIND$, and MUilIT set to a non- 
conflicting tape unit. (Unit 3 is 
usually free. ) 


MFILE 1 Output file on the mean tape. If tv.fo 

files of blob means have already been 
written on the mear’ tape, for £ cample, 
you can start a job with MFILE = 3 
and have all three files on the tape 
with file numbers 1, 2, and 3. 


OB 


IF PRINT = IB, two lines of informa- 
tion are printed for each blob: 


PRINT 
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Variable Default: 

PRINT (Continued) 


MISS 2 


MAXCEL 200 


BCHAN 0 

BCHAN2 0 


Description 

1. blob number, number of pixels 
in the hloj, number of interior 
pixels in the blob, the mean 
vector computed from all pixels 
in the blob 

2. line mean, point mean, the 
mean of the interior pixels 
in the blob. 

If MISS lines go by without any pixels 
from a certain blob on the active list, 
then that blob is considered completed. 
It is printed (if PRINT = IB), stored 
in the mean output tape buffer and re- 
tired from the active list. MISS is 
the same variable as SKIP in BLOB 3.0 
and must have the same value or errors 
will occur. 

The maximum number of blobs; on the 
active list. MAXCEL must be larger 
than the maximum length of the active 
list in the preceding BLOB run or erfhrs 
will occur. This number occurs in 
the BLOB printout with the message 
"MAXIMUM NUlffiER OF BLOBS ON A LLNE =" . 

It does not save time to make MAXCEI. 
small, only space. 

Wlien these variables are left at 0, 
it is assumed that the blob number is 
sorted in the last two channels of the 
'nput data vector. If these variables 
are given positive integer values, then 
the quotient after dividing the blob 
number by 512 is assumed to be found 
in channel BCHAN2 and the remainder 
ir. channel BCHAN. Thus BCHAN2 is the 
first of the two channels containing 
the blob number. 
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B120 

ENOUGH 


OUTS 


The 

ID 

I CHAN 


120 The mean output tape has an artificial 

line length of B120. Normally, this 
variable would not be mentioned and 
the line length left at 120. 

1 The blob mean putout on the mean 

tape is normally the mean of the in- 
terior pixels if there are any, and 
otherwise is the mean of all the pixels 
in the blob. But you can set ENOUGH 
to a number > 1 and then the interior 
pixels are used only if there are at 
least ENOUGH of them. 

OB ETien data from the mean output tape 

is going to be used by R. Hieber’s 
program 17RAND, you set OUTS = IB, 
which has the effect of making the 
number of channels a multiple of 8. 

The unused channels are filled by 
zeroes . 


following control variables are read in Process input: 

blank ID can be set with a segment and- try 

number as IDO) and ID(2) or the last 
two values of the NSA vector. If left 
blank, STRIP will take the segment num- 
ber from the title record and assign 
a try number of 0. 

0 I-Jlien ICHAN is given an integer value 

in Process input, that value is stored 
in channel NDATA+10 of the mean output 
tape. It provides a quick way of refer- 
encing the segment try. 


2pi 
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APPENDIX II. 3 

USER DOCUMENTATION OF BCLUST 
(W. Richardson) 


ROUTINE: BCLUST 

VERSION: 1. 0 

DATE: Noveinber 1977 


PROGRAMMER: W, Richardson 


LANGUAGE: XTRAN 

SYSTEM: Amdahl, MTS 

INTERFACE: An 11- line module to he used after ]7RANn. 


PURPOSE: To group hlohs from many segments into spectrally similar 

strata. The clustering may include ancillary, as well as 
spectral variables. 


SUBROUTINES NEEDED; 


Subroutine RANKP is needed in addition to the standard 11-line 
subroutines. 

BCLUST reads control input in NAMELIST format. Variables not 
mentioned remain at their default values. It also reads a table of 
ancillary variable data, if requested, and a transformation EV intended 
to make the distributions representing different materials and varieties 
approximately sphericauL. 

The blobs are clustered by assigning each blob, in turn, to the 
cluster it is nearest to or to let it be the start of a new cluster if 
the smallest distance is greater than a parameter TAU. The EV trans- 
formation allows Euclidean distance to be used. The clustering can be 
iterated by starting a subsequent pass with clusters obtained on the 
preceding pass. How much weight to give tie clusters on the previous 
pass is controlled by a variable UPFAC. An iteration may be made in two 
steps by writing the cluster means and sizes into a file and then 
reading that file to seed clusters on the subsequent pass. 

The input tape or file is in multispectral ^^ormat but with each 
blob playing the role of a pixel. The first channels are the blob mean. 
Other channels give the line mean, the point mean, the blob number, the 
number of pixels in the blob, the number of pixels in the blob after 
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Stripping the boundary pixels, and the segment number. The output is 
the same as the input except that a channel containing the cluster 
number is added. 


The control variables and their default values are as follows n 


Variable 


NSEG 


NDAT 

NAV 

NY 

MAXCEL 


Default 


40 


9 

15 

26 

100 


Meaning 

Number of rows to be read from the ancillary 
data file. Channel 19 on the blob input tape 
(or file) indicates which row of the ancillary 
data table is relevant. 

Number of spectral data channels to be used. 

Number of ancillary variables to be used. 

Number of EV- transformed channel values to be used. 

Maximum number of B clusters permitted. This 
number cannot be increased without recompiling 
the program with an increased value of the literal 
DM in line 3. 


TAD 10. 


Distance limit determining whether a blob should 
start a new B cluster. 


BCHAN 0 Channel value the B cluster number will be stored in. 

If left at the default value, the B cluster channel 
number will be set to the number of input channels 
+ 1 . 


NMIN 3 Minimum number of blobs in a B cluster to 

permit iteration or seeding of that cluster. 

UPFAC 1. If you are seeding or iterating, DPFAC 

determines how many blobs you pretend 
are already in the seeded cluster. If UPFAC 
= 1. , the pretend number is the actual number 
of the preceding iteration. If uPFAC = 0., the 
pretend number is 1. UPFAC > 0 and < 1 gives a 
number that is between 1 and the number in the 
cluster proportionately. UPFAC > 1 puts even 
greater weight on the number already in the 
cluster. 


yERIM 
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CREAIE 

UPDATE 


TAUFACi 

GOODNOf 

NEWANC 

CHAN 

GHANA 


T If CREATE is T (short for .TRUE.) then a new 

cluster is created whenever the minimum distance 
is ^ TAU. 

T When UPDATE is T, then the mean of a cluster is 

updated every time a blob is included in the 
cluster^ If no seeding or Iterating has been 
done, the updating has the effect of placing the 
cluster mean at the arithmetic mean of the data 
values of the blobs in the clusters. When seeding 
or iterating is done, then it is possible to 
endow each l luster with an artificial number of 
points which have the effect making the means 
less moveable. 

0 On iteration, it is possible to bring the number 

60 of B clusters toward a chosen number G00el.^J0 of 

clusters. When TAUFAC = 1. 

new TAU = of clusters on pr^C-fding pas's ^ 

When TAUFAC =0., no such automatic changing of 
TAU takes place. TAUFAC between 0 and 1 makes a 
proportionate change between these two extremes. 
This option applies to iteration only, not seeding. 

T IF NEWANC = F, then no ancillary data is read. 

NEWANC is set = F if NAV = 0. This option would 
be used on a second pass when the ancillary data 
had already been read into the ancillary variable 
table. 

1, 2, 3, ... CHAN (1) ... CHAN(NDAT) is the subset of 

data channels to be used in the B clustering. 

1, 2, 3, ... CHANA(l) ... CHANA(NAV) is the subset of 

ancillary variables to be used in the B 
clustering. The ancillary variable are 
considered to be numbered as follows; 


1 

longitude 

2 

latitude 

3 

elevation 

4-7 

four Julian dates 

8-11 

four view angles 

12-15 

four gammas 
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/ 


HERAT F 


SEED F 


READEV T 


IT I If 1,. . 

NITj 19 


DEBUG F 


You set ITERAT = T for a subsequent pass on 
which you want to set the cluster means at 
the values of the previous pass. Only those 
clusters with NMIN or more blobs are used. For 
updating purposes, each such mean is associated 
with an artificial number of points ranging 
from 1 (UPFAC =0.) to the number in the 
cluster on the previous pass (UPFAC =1.) 

|[a problem with uting ITERAT is that you 
generally don’t want an output tape on the 
first pass but do on the second. I have 
not tested out ways of overcoming this 
problem.^ 

When SEED = T, then it is assumed that you 
are running a subsequent pass and that the 
number of points, in each cluster and its mean 
vector has been saved in a file which is 
read by the I/O assignment 16. The saving 
on the previous pass is done by writing into 
I/O assignment 17. The previous comments on 
NMIS and UPFAC apply, but the TAUFAC option 
does not. 

READEV = T means read an NxN matrix EV from 
I/O assignment lA for the purpose of transforming 
the data to a space where the distributions are 
approximately spherical. N is NDAT + NAV, On 
a subsequent pass using ITERAT, you should turn 
READEV off or the program will hang up. 

If READEV = F, the EV matrix is not read. An EV 
matrix previously read would then be left 
unchanged. The default is the identity matrix. 

IT (I) = 0 means don't use this segment. 

I refers to a segment number that is found 
in channel NIT (presently 19). This channel 
is not the usual segment-try number (e.g. 

10201) but rather an index number between 
1 and AO that is used not only for the IT 
option but also to access the row in the 
ancillary variable table to be used. 

When DEBUG is T, debugging printout is 
written for every blob. I'Jhen DEBUG = 

T, you generally specify a very small 
number of blobs to be processed. 
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The input-output assignments are as follows. 4 = *SOURCE*, 

6 = *S1NK*, 8 = *PRINT*, 11 = the multispectral input tape or file and 
10 = the multispeGtral output tape or file, as is standard for 11-line 
modules . 

The aneillary data tahle is read from 15 if NEWANC is left at its 
default value of T. The successive lines of 15 atre read unformatted 
and put into the rows of the table. The row number cottesponds to the 
segment index number in channel 19 of the input, so that the table can 
be accessed efficiently. 

The EV tratisforraation to make the distributions approximately 
spherical is read from unit 14. It is assumed to be a square matrix 
stored row by row on one line of 14 and read unformatted. The size of 
the matrix is NGHAN by NGHAN, where NGHAN is the total number Nt) AT + NAV 
of variables. It defaults to the identity matrix. 

If the SEED option is used, cluster means are written unformatted 
on unit 17 at the end of one pass and then read unformatted on 16 on 
the next pass. 
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APPENDIX II. 4 
ANNOTATED LISTING Di ’;?CT 
(W. Hclsztynski) 

WCT is a program to select Q(l) + QC?) +,... segments to represent 
(NH-NDEL) segments. The main program reads an incidence matrix, IM, 
which is the number of blobs in each spectral stratura/segment combination. 

Subroutine GET01 converts the incidence matrix to an array of 
zeros and ones, where a zero is inserted whenever the number of blobs 
is less than LB, the "lower bound". 

Subroutine KRS computes the value of various segment choices. 

Suppose 4 segments are to be chosen in groups of two. The main program 
calls KRS which first picks the best pair and returns to the main program. 
The main program again calls KRS which pick the best complementary pair 
and returns. 

Subroutine NEXKSB is called by KRS and provides KRS with all 
combinations of available segments taken 2 at a time (in the above 
example) . 

After aJLl segments have been chosen, the main program calls sub- 
routine FINCH to make the final choice of training blobs within the 
training segments. The total number of blobs to be chosen may be set, 
(NTB), or the proportion of them (NPR) . The choices are distributed 
proportionally among the spectral strata and further distributed among 
the training segments. In the particular version listed here the 
distribution of blob choices depends upon the number of pixels in eaci 
spoctral stratum rather than upon the number of blobs, and so the 
program also reads a pixel array, FIX. 

Control data is entered in NAFIELlST format when prompted by the 
program. The listing follows on the following four pages. 
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1 

2 
3 
u 

5 

6 
7 

7 , 7 ! 

a 

9 

10 

n 

12 

13 

14 

15 
lb 
IT 
Ifl 

19 
?Q 
21 
22 
23 
?a 
2b 
2b 
27 

20 

29 

30 

31 

32 
51 
34 
55 
3b 
37 
IH 
19 

40 

41 
“2 
41 
4U 
45 
ub 
41 
4b 
(JB 

50 

51 

52 

53 

54 

55 

56 

57 
5B 


IFMpfT >CT,n 

J»UN *FTm r=2 PifitSsAf.nuHCE* P»*PB|N1* TEST F.«« 

i»*p(. icjT t5Ttr,r9(*«T) 

C t Troiril T/1 Ll51 IS tsM5iPN*t3 3»Pl*lPVi|3 5»*3nu<»CF* b»»SINa* 0iS3P»jli 1 3gb2 
inr.lCM. MTC.G4P 

OIMtNSinN tu<IOOrbOl,H(vnr>,6al«FNC«iO)«tH(l6)fTTfaO)f 

|R(lM,»irth),H) nnn)rClbO)fn!rib)rCH{lbtrMR(|60,t>0),T6f40) 

!,PI*(10rt,b«1 

OAU F" /I«0,20t4* |3»0/,NT9,NP«2300»0/r 
n/I. l5*0/,L9/ri2/,BM»NV/uO» 100/, TO/00»0/, wnEt/O/ 

N*MFl T ST/90 a T*/tD, NOEL, FHfNV,NH,LB,0, NTS, NPB 

Ion v*L=o 

i,p=n 

t;*ps.F*LSF, 
on 40 I»!.I5 

IF(FAn).EO.O.*»lO,EW<T*n.Nt,9) GtPb.TaUE, 

40 CD'iTI'IuF 

UFaTNO I 
HfAtNO 3 

UF»0(j,2) NV, NN 
«9ITtrb,bl 

b FnuMAT!' n»4EI 1ST, HOOT* *) 

»F-^n^5,9l)al A ) 
on 7 I 5 1 , rt 0 
7 TUI I = I 

IF tNOFL.FO.OT GO Tn 9 

DO n 1-1, bOFL 

IF rToio.cF.i ,*Nt). ri>fn,Lf,4 P) goto* 

-9 1 Tf (A, 1 01) NOFL, Tn 

tOl fOBH*! f< NOFL T5, TO » '» IOli-0/ 120*, lOJlO) ) 

STII9 

S T T ( Tl)( 1 1 ) s 0 

9 -B]TI (A. BnatA) 

? Fn4"4TMbI5l 
on 5 h 1=1 , bv 
5b HL(Tj=n 

0" 3 1=1. UH 

cm = n 

3 = t 

BFArtf > ,?H ( ibfi.J), 1=) ,NH),Ts1 ,WV> 

9Fin(3,2> trPTrft»J>,:ia1,NMT,t«1,NV) 

□n I] >ii ,'iH 

I in II I r1 , NV 

a r. V I ,‘itiF 

FAIL F4 TuH'‘,5V,HH,ia,TT) 
on an i .fti , «h 
fjh nni Isl.HV 
40) MP( I , n=M( l.J) 

jrrfi 

II Am=F,.m 
on 5 f»2, lb 
•f nsHi i-t )«tviin 
5 tn^TlMUE 
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«)0 


VCNV 




*>s 


on 10 I *1.1 6 

*3 


lF(n(T)*Eo.n](iO rn lo 



K»Qfn 



VI. *VAL 



C»LL <MS(V,7,V,HL,H»,l),vLt") 

h7 


WOITtf b, oojvL 

be 


17iv*i..se.vL.n7,G*p)cnm lu 

b*) 


l«»l7Ff6» niV*L 

70 

M 

Fr'b“ATt' 7(1 ffljJF CI1L-S OF *‘»r Vit.(lE. VAL»',I10) 

7| 


coin Si 

72 

l.u 

on lb J=i,K 

73 


jrs,)c«i 

7b 


CH( JC1=EN(BtJ) ) 

75 

IS 

ctcwfjcnsi 

76 



77 


U«0 

7» 


on ?0 LalfH 

79 


IF (rcF.'jCL' ).OF.O) tin TO ?l> 

flo 


llaLIU 

aj 


bNlLllaFrifLI 

ea 

20 

c n>i I f)i HE 

<*3 





on 35 J=1.M 

as 


on 3b L=l,v 


25 

xRtL , J)='ifL,t7(.I)) 

»7 


VAL>vl 

«6 

to 

cn*<Tt7'iF 


32 

IF (.ic.to. ftlGUti; SS 

<»0 


-7! Tt (hF2HF‘’< n i Isl , JCl 



-“[TerblbUJVAl,. 

»2 

lilt 

FfiMRAT t ' VA| s: * , 1 t 0) 



GflTM 37 

i»« 

33 

••WtTribi 30) 

7b 

So 

fnii-iTt' All t« LEsa than ufl'i 

76 


cn Ti] (00 

77 



7(J 

ru 

CALL FiNCHtPlY, fH,NV,*lH,C7..1C.0PK#7T3) 

77 


GO TO 170 

Iflu 


too 




1P2 


Sl'nHllMl lilF GF TtM (T*4,NV*NH,Ln,TT) 

103 


OI^EnSIOn l‘i(10n»b0) 

1 All 


IMTFGFH TT(iiOT 

1«5 


DO in J=l,bH 

1 76 


on 20 1 = 1 ,'iv 

107 


IF(t-(l,Jl,uT.L7 ,0». TTljy ,FO. 0) GOTO 5 

lOfl 



107 


on TI) 20 

lib 

5 

j7( f ,JJ=0 

1 1 1 

20 

CM'lTINijF 

112 

10 

CObTlVOF 

113 


FFTuwv 

I 1:M 


t*iU 

Hb 



116 


SnOPiJUTlNF aMOfif.R^Vi^L.MiB.vl ,6) 

117 


IGTFGFH H,V,.*Tl,t100J.‘TMflSib11,0fiO»Vt. 

ns 


Ih) .ni ( too) ,M?tl UO) ,v| 

117 


l.nGTCAL iTt,rt*D 
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1 

a 

s 

u 

5 

7 

7.? 

8 
9 

I 0 
I I 
12 
13 
1 a 
15 
tb 

17 

18 
l<> 

?0 

2t 

22 

23 

2a 

25 

2b 

27 

20 

29 
To 

31 

32 
T3 
39 
35 
3b 
37 
3H 

30 

90 

91 

92 

93 
99 
95 
9b 

97 

98 

90 

50 

51 

52 

53 
5u 
55 
5b 

57 

58 

59 


IFHPTY »CT,n 

JRUN AfTv: To2 P»«r5a«Rli|l9Cr« |,««CT,0 Ps*pai>lT* TEST 
tHPl.ICtt TNTEnfor*-?) 

C A TtPITAI 1/1 LI51 IT lBT«T3Pti*t3 TsPlll3PS*|3 SbaSHUBCFa b«»SI9K» •■'<32 n* 1 3882 
Lf'GICAL RTCiGAP 

0lMEb5irTN lo( IOOrba)»HMOt).6fl>,E9(brtl<EH(tb)f T7(afl)f 
IRt ih),Rf ibj jMi ( tonitCfbn),n(ibwCH( tb)FKR(ioo,bO),TOt«o> 

1 ,PI*(lon>bP> 

DMA F> /I Un,20.9,tT*n/,NT8,NPH/S80*02r 
1D/li ITAO/.LR/lZ/.NH.NV/uO.JTO/t T0/O0*02t NPEt/0/ 

NAmR ISTV»0»U/TD,NnEl,FR,5V,NM.LB.U.**T8,«IP» 

Ion vae=o 
dp=o 

GAPB.FALSF, 
on '111 I»i.l5 

IF (Fi.m.FO.O,A»tO,EPUtn,NF,8) GAPb.TRUE, 
an cOnTpiuf 

RF^INO 1 
3 

«FA0tT,21 byi*'9 
RFAPM,?) NV( NH 
-RlKfbibl 

b FORWATC oAMEI ITT.HtlATA ') 
pF.mf >i,R()Al A ) 
on 7 i=], 90 

7 TT(I) = 1 

IF (^0FL.FIJ.1M GO Tfl 9 

dh 0 1 = 1 , worL 

IF ftPID.Gf.l ,A«n. TI)f ti.LE.'H’l GH TO 8 
►OITF (b, 101 1 50FL. TD 

-lot FOHOAT (' NDFt 15, TO « *• .alto/ 120*, lOIlOl ) 

STiiP 

ft T T ( 7 D { n 1 3 0 

9 WOITF ibjOOltS) 

2 Fn9"Arnbi5i 
nn 5b 1=1, 5V 
Sb HL n i = 0 

fin ^ T3t,«H 
Cl I 1 3 0 

I F-.(Tj = t 

uFatk 1 ,2) f (TMf I, Jl,.t=l ,9Ml,Ta1,NV) 

RFAr(3,2) (fPTYf I,J1,.|3l ,«tMl,l=I,NVt 
Dfl 9 JsI fNH 
i <n 9 J r 1 , N V 
«t I,J)3 Tn(I,J 1 
, C»3' 1 .‘luF 

CAiL i:FTOt{‘'.''V,NW,tB,TT) 
ijn UOT ,131 ,“JH 
f}n 11(11 Isi.HV 

901 "PI !, n=«(l,ji 

jrsn 

II Rfi)=r«U) 

DfT 5 7*2,16 

■ f i“i ) ♦t'ltn 

5 CONTINUE 


H«NK 
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l?u 


b*0».T»ilE . 



on 0 1 = 1 ,v 

1?? 

u 

Mim=ML(n 

l?i 


V 1 =0 



CALI wMkSH(H,k,A,mTC) 



on s t=i ,v 

IPO 


on 10 jsi.K 

l?7 


NA=A( J) 


H 

"I ( n=«l C I jAHf I, NA) 

l?V 


H i“i n 1 .FQ.oitnm 5 



vi=vi*«(Mi (tn 

1^1 

S 

Cn-iT|*)iiF 

J V 


lF(K.*lF.|>Cnlr||7 

Hi 


*iP|Tffh,iOHI,(“im,T«I.W),A<l) 


50 

pno-ATC VH'.H,* 111', 5011,* Ae*,I2) 


17 

HCvi.LF.vLHnrn ?i 



HAUs.FALSF. 



on IS 1 = 1 , K 

1 'H 

H 

of n = ‘(T) 



VL = VI 

HO 


Oft PO 1 = 1 ,v 

HI 

20 

‘*?(n=”im 

H^ 

21 

H(“Tr)GiiTii 5 

Hi 


H(IUO) RFHiRt.| 

H<t 


on ?S l=l,v 

lUS 

2S 

ML(n=*t?tr) 

Hb 


Sf TIlPO 

IU7 


tSD 

H« 



luv 


SI"40(|I>TINE >it»KSft(N.K,A,MTC) 

ISO 


P.TFr.fW A(K1, H 

ISl 


unotCAL HTC 

IS.; 


(lATi *IL*ST/0/,K| AST/0/ 

Hi 

H 

IF{K.«)t.K|.AST.t|0.'J,Nt.SLASnr.n TO 20 

IS« 

JO 

lF(“Tri (in TO no 

ISS 

20 

-2s0 

Ho 


S = K 

H7 


'll ASTsv 

1S.J 


H aSTsk 

ISO 


•iTC=.THilt. 

160 


i;n To so 

HI 

00 

on U] M=1,K 

H2 


M? = A(K«HH) 

Hi 


1F(“2 .i-F.*i«I-H) tn TO so 

H« 

0! 

CnsTI'iiiE 

HS 

50 

on Si j=i,H 

Hb 

51 

a(h+J-m) = '*2*J 

H7 


"TC=(*(1) .NF.S-k*l ) 

H6 


OFTMRS 

HO 


EOD 

|7u 



17J 


Si'HWii'iT IliF F I'lCMfPl*, IM.Ntf »0H,CM, JC»nPH,»«TB?) 

172 


INTFtFO Iu(|on,b0),CH(tS),T.t>(|00»tb)«KT(100) 

|72.S 


C.bOl, PT(IOO) 

l7i 


MTm = STB? 

1 7« 


'iTHtS » 1 

17S 


t = 0 

17s, 


TPsO 

177 


on i 1=1, ov 

17» 
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APPEtIDIX III 

A NOTE ON THE RATIONALE FOR PARTITIONING 
IN LARGE AREA REMOTE SENSING SURVEYS 

CR. Kauth and Richardson) 

In carrying ou: large area rei^te sensing surveys the remotely- 
served image data may not, by itself, be sufficiently definitive of 
the classes or conditions we seek to survey; ancillary information, 
such as weather reports and ptior years' histories will often be 
required. The remote-sensed image features themselves may be the 
result of a preprocessing transformation on the original image data, 
e,g, , a haze correcting transformation. How can the ancillary data 
be used in conjunction with the remote sensing image features to 
classify a scene? 

Suppose we have a disjoint set of recognition classes, i = 1 m 

and a discrete random variable w which takes these values. Then 
the method of classification is to choose that w (i.e., choose 
that class) whose posterior probability is a maximum given both the 
feature and ancillary data observations. In symbols, 

choice » max prob (wiy,v), 

where y is the image feature data and v Is the ancillary data. Manipu- 
lating this expression by Bayes rule, 

max prob (w[y,v) * ** max prob (w.y.v) 
w w prob (y ,v) 

* max prob (y,v|w) prob w 
w pf bb (y ,v) 

Since for each observation the values of y and v are fixed while 
the expression is maximized over w, the denominator can be ignored, 
thus the rule is 
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max prob (y.vjw) prob w. (1) 

w 


This Is just the maximum likelihood procedufe for y and v taken 
as a single vector observation with prior Weights assigned to the 
classes. Thus the proper way to use ancillary data is to add ancillary 
channels to the data, train over a wide range of ancillary conditions, 
and classify. 

The Immediate objection to this approach is that our present large 
scale system structure is not set up to handle these added channels in 
the mainstream processor. In addition, the training function would have 
to be organized on a very large scale. Finally, in order to actually 
implement this approach, confidence in the fundamental concepts of 
likelihood model classification is required. 

The first of the above problems can be surmounted by a modified 
form of the maximization problem, thus. 


max prob (w|y,v) 
w 


=> max 
w 


prob (w_,y.,y). 
prob (y,v) 


a max 
w 


prob (y I V ,w) prob (w I v) prob v 
prob (y,v) 


= max 
w 


prob (y|v,w) prob (wjv) 


( 2 ) 


With this formulation one trains the likelihood model for 
prob (y|v,w). When processing a segment, the values of v for that seg- 
ment are used to set the signatures for processing the Image vectors, y. 
The conditional weight must also be trained, but its accuracy Is not 
critical. Thus, In this formulation we have moved the problem of 
establishing signatures off line and allowed the use of a conventional 
image processor but the problem of training is just as severe as before. 


r.'" I ' ■,,, I 1 ■ l 
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An approximation to the above procedure can be obtained by parti- 
tioning the survey region Into bounded domains of the ancillary variable 
and then training and classifying within each partition. Thus the 
objective of partitioning is to make a step-wise approximation to the 
likelihood function 

prob (y!w,v) = (y|w), v e 

where is the region covered by the k^^ partition. 

If the training were completely representative over each partition 

thent 


prob(y{v,w) prob(v) dv 


(ylw) 


*k 


prob (V) dv 




Also the prior weight can be approximated » 
prob (wjv) = \(w). V e 


where 


prob (w|v) prob (v) dv 


r^(w) 




prob (v) dv 


Thtis the classification rule becomes, 

\ <y|w) rj^(w) 
w 


<3) 


The effect of partitioning is to increase the variance of the 
resulting signatures relative to what it would have been if the more 
correct formulation, Equation 2, were used. 
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Thus , 

Var (ylw>) L Var (prob (y|vj^)) 

where Vj^ is the value of the aicllXary variable at some arbitrary point 
in the partition. The equal sign would be employed only if prob(v) 
in Equation 3 were a delta function, 6 prob(yjv,w) = 

prob (ylw) independent of v. Neither of these conditions is likely in 
practice, therefore the partition-wide signatures have increased variance 
and increased classification error results. 

An additional source of errors Is failure to train sufficiently 
within the partition, leading In effect to errors in estimate of 
prob {y|v,w). These errors can be expected to decrease asymptotically 
to zero as the number of training samples increases. The number of 
training samples Is limited by the size of the partition; hence there 
is some partition size at which the sum of errors from the spreading of 
the likelihood model and the errors from insufficient training Is a 
minimum. 

Finally, an additional reason for increasing the size of parti- 
tions is to minimize the cost of training, by increasing the ratio of 
the number of recognition segments to the number of training segments. 

The above setting gives us a rationale for partitioning. Parti- 
tioning followed by within-partitlon training and classification is an 
approximation to a more correct procedure contained in Equation 2. The 
Increased variance of the partition-wide signatures causes Increased 
classification errors, making It desirable to decrease partition 
size. This is balanced by training errors and cost considerations which 
make it desirable to increase partition size. 

Methodology for Estimating a Metric for Clustering Ancillary Data 
A reasonable approach to partitioning Is to cluster available 
segments within the space defined by their ancillary data. In order to 
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use such a method some reasonable metric Is required, l.e., some measure 
of the relative importance of the various ancillary variables or com- 
binations of them. The metric is expressed in the form of a covariance 
matrix ) that is used to determine which cluster each v point (l.e., 

V 

each segment) belongs to. V characterizes the relative stability of 
prob (yjv,w) as v changes in different directions. It will, in general, 
be quite different from the covariance matrix of the distribution of v. 

The problem of partitioning is thus transformed into the problem of 
finding a reasonable estimate for 5!^. Ln the case of the wheat survey, 
this estimate might be based on an estimate of the wheat signature alone 
or on performance using both the wheat and confusion crop signatures or 
on a training sufficiency basis. We propose, as a flrsL step, to esti- 
mate 5^^ using only the signature for dryland wheat. 

Consider a dryland wheat probability density function which is a 
joint function of both the feature variable y and the ancillary varia- 
ble V. Assume that the distribution of y and v is joint multivariate 
normal and that this description is valid over a region within which 
we expect to create partitions. We let z be the joint vector 

C be the covariance matrix of z 
z 



where is the covariance matrix of y and C^, the covariance matrix 

of V. We let the mean be zero for the convenience of representing y 

and V as deviations from the mean. In short, z Is normal (0, G ), 

z 

We are, as will be seen, particularly interested in the conditional 
distribution of y given v. This Is 


prob (yjv) 


prob (y,v) ^ prob (z) 
prob (v) ” prob (v) 



= constant • e 


(A) 
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We can write the argument. A, as 


, „ T„-l T„-l 
A=zC z-vC V 

Z V 


= (^) 


N, I c G 

T / y yv 


T 

C C 

yv V 




c c 
y yv 


)“■( 


Q Q 

y yv 


c'^C/ 

y V V # » y V V 


Then, 


=(0 


T /Q Q 
/ y yv 

\ Q ^ Q 

\ yv V 


il) - 


(y'^Q + v*^Q y^Q + v**^Q )(^) - v'^c’^v 

V y yv ^ yv ^v/'v/ v 


T T T T T T — 1 

y Q y + v Q y + yQ v + vQv-vC v 
^yv * ^yv v V 


yVy + 2yV,v + v’’ (q^ - c;^)v 


1 


I 
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By the Roy and Sarhan theoreotf 

C - Qv - V Vv 

A - y^Qyy + 2y''Qy^» + q^l q^^). 


= (Choiesky decomposition) 


-1 -1 -1 

Q = K K 

y y y 


A = y\ \ y + 2y^Q v + v^Q 'hThC^ Q v 
y y yv yv y y yv 

' <K,y + K‘^\,v)''(Kyy + K’^’^Qy^v) 

- tKy (y + RyV^'^qy^y)]’^ (Ky(y + K'^x’^'^Qy^v) ] 

- (y + 

Thus prob(yjv) has a mean of -Q~^Qy^v and a covariance independent 
of the value of v. 

Now we wish to establish a distance measure in the ancillary varia- 
ble space which will represent the importance of any deviation v from 
its mean of 0 in causing change in the distribution of y given v. A 
measure of this sensitivity is the probability of observing the point y 
at its mean 0. We have 


prob (y = 0|v) “v e 


-f (q"^Qy^v)’^qy(qy V,v) 


■ 
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and V values for which Che argument Is constant can be said to repre- 
sent equally Important shifts in v. 

T T -1 

A ■ '' V\ ’yS V’' 

• v^(Q )v 

If we let *= Q ^Q~^Q » then 

^yv y V ^ 

A = (7) 

V 

Thus we have found that 

1 T 

-J V 1, 

prob(y « 0|v) = e ^ 


This expression can also be viewed as a distribution g(v) in v 
space reflecting the stability of prob(y|v> because an equally dense 
contour of g(v) in v space corresponds to a constant prob(y = 0|v). 

the covariance matrix of g(v)» is thus an appropriate covariance 
to use in clustering the points of v* 

We have shown that if we cluster the segments by clustering their 
ancillary data values v using the covariance matrix 

I ** <Q^ Q~^Q ( 8 ) 

we will give proper relative weight to the ancillary variables in 
terms of their ability to effect the dryland wheat signature. 

In order to carry out the above prescription it is necessary to 
obtain an approximate description of the covariance of the vector 
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over the entire region which we hope to partition. But it should be 
bom in mind that it is not critically important how accurate this 
description is, since it is not critically important to be very accurate 
in choosing partition boundaries. However the matrix can tell us 
some important things. The least eigenvector of this matrix points in 
the direction of the most critically Important ancillary variables. At 
the opposite extreme the eigenvector whose eigenvalue is largest points 
to the least important linear combination of ancillary variables. 

The size and number of partitions can be controlled by varying the 
distance limit, a clustering parameter that determines when a point 
should start a new cluster. 

It is quite possible that the unconditior.al covariance of v, C , 

V 

is singular or nearly so, especially since some of the ancillary varia- 
bles may be highly correlated to each other. Then, even though is 
not near singular, (being a projection of the feature data, y, onto 
the ancillary data space) it would not be possible to compute it by 
Equation 8, since and will not be calculable, since will be 
singular. The possibilities for surmounting this difficulty include 
using a generalized inverse; taking only the beat linear combinations 
of v components; and adding a small fixed noise variance to the matrix 
before making the calculation. 




APPENDIX III.l 

HOW TO FIND A METRIC FOR PARTITIONING THE ANCILLARY VARIABLE SPACE 
WHEN THE COVARIANCE MATRIX OF THE ANCILLARY VARIABLE IS SINGULAR 

(W. Richardson and R. Kauth) 


The discussion requires the use of certain theorems about the generalized 
inve’*se of a symmetric matrix. These theorems are given in Part 1. 

The substance of the appendix, a continuation of Appendix III, is given 
in Part 2. 

1. THE GENERALIZED INVERSE OF A REAL SYMlfETRIC MATRIX 

In this section, the generalized inverse A”^ of a real symmetric matrix 
A is defined. A method for finding this inverse is summarized in 
Theorem 1. Two other results that are needed will be proved as Tlieorems 2 
and 3. 

Odell and Boullion* define a generalized inverse or '*pseudoinverse" 
of the complex matrix A as a matrix X satisfying the following four 
conditions: 

AXA = A 
XAX = X 
(XA)T “ XA 
(AX)'^ = AX 

If A is real and symmetric, the generalized inverse can be found as 
follows: 

There exists an orthogonal matrix P such that 

T 

PAP = diagonal 

* Books discussing the generalized inverse are given as references 
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To fix our ideas, let us suppose that the non-zeto diagonal elements 
(eigenvalues) are in the upper left. P Is the matrix of the eigenvectors. 


PAP*^ = 



Let us suppose, further, that A is positive semide finite as it will 
always be if it is a covariance matrix. Then a,, .... a^. are all > 0. 
Let 


Then 



SPAP'^S = 



def Ij. 


If we let T be the matrix SP and it X has covariance matrix A, then 
Y = TX has covariance matrix = I^. 

Odell and Bouillon give the following formulation of a matrix satisfying 
the first condition AXA = A. Find non-slhgular matrices P and Q such 
that 


Term 
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Then 


X - QI^P 

satifies Ehe first condition AXA = A. 

Proof: To prove: A QI^PA “ A 

l.e., to prove: A Q Ij-PAQ = AC) 

i.e. , to prove*. P A Q 1^ PAQ = PAQ 

l.e., to prove: Ij. which it is. 


A formal proof takes these steps backward, first preinultlplylng by P 


-1 


then postmultlplylng by Q 


rl 


You can go backward or forward In these 


proofs when the matrices are non-singular. 


This same X also satisfies the second condition 


XAX = X 

Proof: To Prove: QI^PAQI P = QI P 

^ t r 

The PAQ In the middle = Ij.. Hence, we must prove that QIj.I^Ij.P = QI^P, 
which it does. 

For A real, symmetric and positive definite, SP plays the role of P in 
the above arguments and pf s plays the role of Q. SP and PTS are 
non-singular and (SP) A (P^S) = 1^,. Hence, (p"^S) Ij. (SP) has been 
shown to satisfy the first two conditions of A^. 

We will show that P SI^SP satisfies the last two conditions also and is 
therefore A^. We first observe that A and X = P^SIj.SP are both symmetric. 

xT = pTsTij.s'fp = pTsTl^SP = X 
because = S. Therefore 


(XA)'^ = kV - AX 

To prove: AX = XA 


i.e. , AP'^SIj.SP = pTsij.SPA 

(SP)AP'*'SI^SP = (SP)p'^SIj.SPA 


i.e. , 



because SP is non-singular 

i.e.p (SP)ApTsirSP(pTs) = (SP)P'J^SIi-SPA(P’^S) 
i.e., (SPAP'^S)lpS(PpT)S = S(PPT)SI^(SPAP^S) 
i.e., which it is* Q,E.D. 

Finally* we must prove that 

(AX)*^ = AX 
i.e., X^A^ = AX 

i.e., XA = AX which is what we have just proved. 

Thus, P^SlpSP is the unique pseudo-inverse A"^ of A. 

You might think that with all these desirable properties, AA'*’ would = 1^., 
But such is not the case. If AA"^ - Ij., which Ij. would it be? The 
one with positive values in the first r places? The last r places? 

A scattered subset of size r? 


f 

If AA+ = Ip, then 

• * • ®ln 


V : 

AA'tA would = Ij-A = 

Srl • * ' ®rn 

o o 

^ ■ ■* 

^ A 

i 


so property 1 would not hold. 

If we try to prove AA'*' - Ij. by previous methods we would try to prove 

AP'*'SI^SPT= Xj. 
i.e., SPAP^SI^SP^= SPIy 
i.e., Ij-lrSP = SPIr 

The left side is a matrix of the first r rows of SP and zeroes elsewhere. 
The right side matrix has the first r columns of SP. So the attempted 
proof fails . 

Thus, we conclude that the pseudo inverse of a covariance matrix A 
(real, symmetric, positive semldefinlte) is 

P^SI^SP 


j.'’, 


:/ 

i 



In words, you obtain the pseudo inverse by finding an orthogonal matrix P 
such that PAP^ Is diagonal, take the reciprocal of the diagonal elements, 
which we have called 



and transform back, P'^(SI S)T. 

The same procedure works for a real symmetric matrix A whether positive 
semidefinlte or not* The only shift in thought is that S is complex. 

But SI^S = S^I|- is real, so A^ has the real factors P^(SIj,S)P. Our 
conclusions are summarized in: 

Theorem 1 . To find the generalized Inverse of a real symmetric matrix A 
find an orthogonal matrix P such that 



The rows of P are the eigenvectors of A and the diagonal values 

a^, o, ..., o are the corresponding eigenvalues. The pseudo- 
inverse A"^ of A is 
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Theorem 2 . If a matrix A is of the form 

v;here B is non-singular » 


then 


(::) 


'* ■ (r :) 


Proof: The four properties can be shown to hold by matrix multiplication 

applied to submatrices. 


Theorem 3 . If B = PAP*^, where P is orthogonal, then B"*" = PA‘*T^, 

Proof. To prove; BPA'*^'*^B = B 
i. e. , P^BPA'^P'^^BP - P^BP 
i.e., A A”*" A = A which it is 


T 

because A = P BP. Property 1 proved. 
To prove; PA'*T’^BPA"'‘P^ = PA**‘p'^ 


which it is because P^fiP in the middle = A and A'^AA'^ = A"^. 
proved. 

To prove: (BPA’^P^)^ = BPA^P**- 


i.e., 
i.e. , 


p^+TpTgT ^ 


+ T 
BPA^P^ 


pTpA‘*'%'^B'’^P = P^BPA+pTp 


Property 2 


i.e. , A’^'^’a'*^ = AA'*' because p'^^P = I and pVp = 
i. e. , (AA^)^ = AA**" which it is 

because A"*" has Property 3. Property 3 proved. The proof of Property 4 
is analogous. 


2. A METRIC FOR PARTITIONING THE ANCILLARY VARIABLE SPACE WHEN THE 
COVARIANCE MATRIX IS SINGULAR 

We suppose, as in the previous appendix that the spectral observe ''ions 
y and the ancillary variables v jointly have a covariance matrix 



and that the raxm covariance matrix Is singular, having rank r < m. 
Then there exists an mxm orthogonal matrix P such that 



where D Is diagonal. The rows of P are the eigenvectors of Cy and 
,,,, Oj., o, o, the corresponding eigenvalues. We get the 

eigenvalues In the upper left, for convenience of notation, by rearranging 
the rows of P If necessary. This does not destroy the orthogonality 
of P. 

Let 


'^r+l * 


w = Pv 


cov w = PCyP^ = 



Wjjj are constant because they have zero variances. 




■c:)C)^-(:) 


« ■{; :) (i* I’fi c y 



/Cy CyyP^ \ 

\PCy/ PCyP^y 


( 1 ) 


r 


t 


! 


I 


I 
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In this matrix of submatrlces , 


PCyP^ 


e:) 


Now if wj is constant} gov (w^, y^) = 0 because 

GOV (w^, yj^) “ Cw^yj - Cw^Cy^ 

® wjByj^ - wjCyi “ 0 
T 

Hence, the bottom m-r rows of PCyy are 0 and the rightmost m-r columns 
of CyyP^ are 0. So 




GOV 


(v) ' I Vyv* ■> 


lO 0 

where consists of the first r rows of P. 
We define 


cy+ - 

Gy - 


y ^yv] 

Qyv”^ Qv 


as in Appendix III, except that the Inverse Is now the generalized 
inverse. Because 

= rgJJr’', 

it follows from Theorem 3 that 


QyvP^ 

PQyP^ 




(2) 


after algebra like that of eq^tlon (1). By Theorem 2, since is 


bordered with zeroes, so is C: 


y+ 

W 
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‘ Hence , 



Qy QyyP/ 0 

PlQyV^ E 0 

0 0 0 


(3) 


In Appendix III, we defined the metric in ancillary variable space as a 
covariance matrix 

^v “ ^Qyv ^yv^ 


In a clustering operation, we would assign a point v to a cluster with 
mean v^ if the distance 

(V - V(j) (V - Vq) 

were a minimum for all cluster means. Thus, it is 

y 

that would define the metric used for clustering if were non-singular. 
From Equation (2) we would similarly define the clustering metric in 
w space as 

= PQy/Qy'lQy^p'^ 


From Equations (2) and (3) we infer that 


Hence , 


PQ 


yv 


-Q) 


<yV 


QyvP “ Qyv 




^ \0 Oj 

We note, in passing, that because Mlj, has rank r, 


(4) 


P^M^jP has rank r. 
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The result of Appendix III holds only for a noii^slngular covariance 
matrix of ancillary variables. Hence> we apply it to the set of variables 
w^, .... w^ and obtain the metric 

When this metric is embedded in the full w space, it becomes as given 
in Equation (4). The last m-r values of w are thus provided for, but 
they do not affect the distance calculation. The distance from a w point 
to a cluster mean in w space is 

T 

(w - w )*M (w - w ) 
o w o 


= (w - w„)’'(PQy,’'p^-lqy^P^)(w - w^) 
= (pT„ - pTw„)''(Qy^'''Q^'‘Qy^HpTw - 


T 

Now w = Pv so V = P w. Let v and v^ be the same setting of ancillary 
variables as w and but expressed in v coordinates, the distance 
just defined is 


(v - Vq) (Qy^''^Qy"^Qyy) (v - Vj,) 

Thus, My = Qy^*^Qy~^Qyy is the metric to be used for clustering in v space. 


3. CONCLUSIONS 

The lengthy discussion in this appendix results in a simply-stated con- 
clusion, as follows. Let 

("’ , 

\V ‘^v/ 

be the Inverse of 

il) • 

if it has one, and the generalized inverse otherwise. The metric for 
clustering in ancillary variable space is 

^yv ^y *^yv 
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This metric is the same as the one derived in Appendix* III except that 
it is more usefully defined as the matrix M in the distance 

(v - Vq)^M(v - Vq) 

rather than as MT^ = ly. 


To find the generalized inverse C^'*' of cov (^) find the matrix P of 
eigenvectors (by rows) and the corresponding eigenvalues 

• f Of •••! 0» 



oJ 



APPEHDIX III. 2 

INVARIANCE OF THE CONDITIONAL COVARIANCE 
(W. Holsztynski) 

1 . INTRODUCTION 

The main goal of this note Is to prove a conjecture in linear algebra 
of R. Kauth and W. Richardson. Suppose that we are given a- and Y-dimensional 
vectors 



In the context of inultispectral data processing, the y and v vectors repre- 
sent specfai and ancillary data vectors, respectively. We can introduce 
(d+y) -dimensional vectors 



and matrices 

^ ~ * jXjj] » V — [ v^ , . . . , Vjj] , z - ^v] ~ f » * • • > 


I 
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\ \ "k = \ 


-la 


and y, V and z are the respective means: 


y = f (yi + • • • + y] 


- 1 - r^i 1 

,) , V = ^(Vj^ + ... + Vjj) . * " I 1“ 


Suppose that the rank of Z is a+Y, i.e., that the covariance matrix of Z 


y = ^ ZZ^ 

^ N 


is a non-singular matrix (it follows that 


I = 4 I = i" 

Lyy }J i-VV N 


are also non-singular) . Then we may consider the multivariate normal 
probability density 


d{z) =* ke 


1 t 

^ 2 i z 


where z 


One can also compute the conditional density function ^ (y v) , where 


. r 


I 


I 


fl 
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In order to do It let us write 


( 10 ) 

where A and G are otxa and yxy matrices respectively. Then 

- |-(y + A"^Bv)'^A(y + A”^Bv) 

d(y|v) = ve (11) 

by Equation 6 of Appendix III.l or by Section 4 of this appendix. Observe 

that A ^ Is the covariance matrix of yjv and -A^^Bv is the mean. By sub- 
t -1 

stltutlng Sin for A and u for Si Bv, we may write the above formula in a 
compact way: 

- + u)*‘{Si*'y + u) 

d(y[v) = ve (12) 

Because of this and for some other reasons R. Kauth decided to replace 
vectors v^,..,,Vjj by u^ = n ^Bv^,...,Ujj — Si Richardson 

ran the above compu .^tions again in order to compute the new covariance 
matrix A-j, of the conditional density d(yju) (corresponding to the old 
covariance matrix of (11) above), and the new mean value A^ of d(yju) 

(the old one, given by (11), was A ^Bv) . They were surprised to find that 

Ay = A and AjJ^ByU = A ^Bv, (13) 

i.e., the conditional covariance matrix and the conditional mean value were 
invarxant under the linear map 

L = 



(14) 
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Since the concrete matrix Z they started with was sizeable and without 
any pattern, they conjectured that (13) holds always, granted that the 

F 1 t 

covariance matrix of u, ^ invertible; here 

"[u] * ^ a 

The invertibility of the covariance matrix of is discussed in 
Section 2. In Section 3 the proof of the conjecture Is given. Section 4 
contains an alternate derivation of the conditional density of y|v. 

2. INVERTIBILITY OF THE COVARIANCE OF Z^ 

We keep the same notation as in Section 1. In particular: 


A = 




L = S2“^B 



(16) 

\ = ^^k 

for 

k=l , . . . ,N ; 

(17) 

U = [n , . . 

• ♦ 

==. = [3- 

(18) 


We want to know when 

Ib = s Vv 

is a non-singular matrix. It is known that this happens if and only if 
the rank of is 2a, i.e. , when there are 2a linearly independent vectors 
among 


’^l' 




y • * « y 
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(observe that are a-dlmensional vectors) . But we do not want a con- 
dition which involves all 

1 N 


Remark 1 

In what follows, in this section and in Section 3, the results are 
valid even if Y, V and Z do not have mean 0 (in other words yj^, Vj^, 
of (3) can be replaced by y, , v. , z ). 


Theorein 1 

Jy is non-singular if and only if rank B = a. (We always assume that 

V 1 t 

2, = — ZZ is non-singular in first placel) 


Proof 

Since ^ is non-singular there are indices 

l<i, <...<i <N and l<i, <... <j <N 
— 1 Y — — ~'l ■'a — 


such that , . . . , v^ are linearly inde-'endent in R’ 

^1 y 

Gi 

linearly independent in R , and i^ for any n=l,.. 


are 

^1 -*« 

,y and m=l , . , . ,u . 


To justify this statement, we can choose a+y linearly independent z vectors 
from among the N given and consider them as a matrix, each column of which 
is a z vector. The top a values of each column is a y vector and the 
bottom y values, a v vector. The determinant of the matrix can be 
expressed as a sura of terms, each of which is a minor determinant in the 
top half times a complementary minor determinant in the bottom balf. 

Because the determinant of the matrix is not zero, there is a pair of com- 
plementary minors whose determinants are not zero. An acceptable set of 
indices corresponds to the columns of the top minor and a dis- 

joint set L,...,i corresponds to the columns of the bottom minor. Thus 
X Y 

the proof of the statement is established. 
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If rank B = a then there are cx linearly Independent vectors among 

-1 -1 
Bv^ , . . . , = n BVj 

1 H Y Y 

(because when a matrix D Is non-singular, rank of DC = rank of C). 

This means that there are 2a linearly independent vectors among n+Y vectors 




— — ^ 







^i 

Y 








f 


* • • * » 


c 

I-*- 


c 


hi 





— — 


1 .^ 


Thus is non-singular. 

Conversely, if is non-singular then there are a, and not more than a 
linearly independent vectors among Uj^, u^j. Thus rank B = rank L = a. 

Theorem is proved. 

Remark 2 

The obvious necessary condition for 5^^ to be non-singular is y ^ a* 

The class of examples below show that this condition is not sufficient. 

example 1 

Let linearly independent vectors 

a = [a^^, . . . , a^J e R for k=l, ...,a*fY 
k i 

be such that a is orthogonal to a whenever 
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Next, let 


and Vj = 


a+1 


a+Y 


for 1 *= 1 , • • • » N. 


t 

Then rank Z = a + y. Thus ^ is non-singular. On the other hand YV = 0, 


^ ■ 6 V j 


y = i 

L M 


YY 


Q W 


It follows that y ^ is of the form 


r' = N 


t -1 
(YY ) 




Thus B = 0 and rank B = 0. Thus, by the above theorem, is singular. 


The simplest example of this type is; 


<3. - y = ly 




- 1 


^2 


= 0 


V 


1 


= 0 


« 1 . 
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Then 


2l = zz" = 


and B = 0. 


10 10 1 G - . 

= = A y~i 

0101 01 2^ 


The simplest example with the averages y* v (and z) equal 0 is: 
a - Y = 1* 


Tl-l 


= -1 yj - 0 


''l ■ 1 


V^ = 1 


Vg - -2; 


and again B = 0. 

Remark 3 

Rank B != rank L - rank A 


3. PROOF OF THE CONJECTURE 

In this section we assume that both 

Z„-[b] [’fV] 

are non-singular. 

Theorem 2 

” A "1 

■ 

i+N(uu*^) 
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Proof 

Since = I hence 


AYY*^ + BVY*^ = NI 


and 


AYV*^ + BVV^ = 0. 


Indeed, we have 



where P and S are axa and yxy matrices respectively. 

P = AYY*^ + = AYY*^ + £2(n"^BV)Y^ 

= AYY*^ + BVY^^ = NI (by (20)) 
and 

R = AYU*^ + nUU*^ = AYV*^(n“^B)*^ + n(fi"^BV)v‘^(fi“^B) 
= (AYV*^ + BW*^)(fi“^B)*^ = 0 



Then 


(by (21)). 
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Since R=0 hence 

AYU'^ = -JiUU*^ 

t 

and substituting for A, 

-1 t "1 t 

-Q = (UU ) UY . 

Now we can compute Q, 

Q = + (I+N(UU*^)"^)UY*^ 

= £1”^(AYy‘^ + nUY^^) + N(UU*^)“^UY*^ 
= NJ2~^ - = 0 (by (23)). 

Also, by (22) and (23), 

S = + (I+N(UU*')"^)UU*^ 

= iZ”^(AYu‘^ + fJUU*^) + NI = NI. 


The theorem is proved. 

The above proof can have a more conceptual finish. After 

r-1 

formula (24) was displayed we could see that must be of the form 




» 


where F and G are some axa matrices. Since is symmetric, so is 
Thus 


F = 


(25) 


(26) 


<2 


2 ^ 
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Finally, since 

NI = FYU*^ + GUU*^ = + OU*^ + (G - DUU*^ 

= n'^CAYu*^ + nuu*^) + (g - i)uu*^ = (g - i)uu*^ 

hence 

G = I + NCUU*^)"^. 

The theorem is proved again. 


Corollary 1 

Let us consider the multivariate normal distribution which has 


1 tv-1 
■ 2 ^ ^ 


du<z) 


- ce 


(2 e R^“), 


as the density function. Put 


■ 


where y,u e R 


Then 


dy(ylu) 


- i (y + A"^flu)^A (y + A~^£2u) 

pe 


Because u = fl 
A~^ttu 



Bv 


(as in (11)). 


L. 
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Thus d(yjv) and d^(yju) have the same mean and covariance matrix, proving 
the conjecture. We note further that 


so dy(y[v) can also be written in the form 


d (yju) = pe 


-1 -1 
^ y (y + u)*'A (y + u) 


The axa matrix YU is non-singular, 


Proof 


By (24), 


AYU^ + ilUU*^ = 0. 


t t-^ t 

YU = -n’' UU*". 


which is non-singular because UU is a principal minor of the positive 
definite matrix 


Remark 

Put L = L = 

V 


If we apply the same operation again, then A = A 


should be replaced by A^ which happens to be A again. And = B should 
be replaced by B^ = n. Thus we see that the second iteration of Kauth's 
operation 


L = n“^B = = 1 

u u u 


is already identity. The process stabilizes after the first step. 


2PI 


FORNEAir WiLtOW RUN UAB'OR*TORiE5 frit ONiVER&lTl Of MlCMlilAri 


4. ALTERNATE DERIVATION OF THE CONDITIONAL DENSITY OF y|v 

Let M be an arbitrary positive definite symmetric (ct+y) x (ii+y) 
matrix. We can write it in the form 


M 


' A I B ' 
_ _ I 

t* 

Lb I cj 


(27) 


where A and C are axa and yxy matrices respectively. Then A and C are 
positive definite and symmetric. 

Consider the normal zero-mean density function 


d(z) = ice 


1 

2 


zSz 


for z e R 


u4y' 


(28) 


Write z as 


z = 


■y" 

.V. 


Ct Y 

where y t R and v e R , 


We want to compute d(y|v). Observe that 


z^MZ = 


r^i 

t 

A 

B 



r 

< 

1 


y 

C 


J 


y^Ay + v*'B^y + y^'Bv + v^Cv 


= (y + a“^Bv)*^ A(y + A'^Bv) + v*^(C - B*^a'^)v. 


(29) 


(30) 
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to compute the marginal density of v, we consider the integral 


-f 


(y + A A(y + 


A ^Bv) 


Ke 


dy 


Except for the constant <, it has the Coriri of an integral of a multi- 

Ot 

variate normal distribution over R , the space relevant to y. It is 
thus invariant to the choice A ^Bv of the mean and y integrates out. 

A is a constant depending only on A (but not on v, y, or B) . 

Thus the marginal density of v is 

J t t -1 

- ^ V (C - BA B)v 

D(v) = .^e 


The conditional density 


is given now by 


- -2 (y + A A(y + A ^Bv) 


d(ylv) = ve 


We conclude that the conditional density Q{y|v) is normal; A 
is its conditional covariance and -A ^Bv is its mean value. 


-1 


(31) 


Thus 


(32) 


(33) 


( 34 ) 
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APPENDIX IV 

LAST MINUTE RESULTS AND AN EXPLANATION OF THE 
SIGNATURE EXTENSION PROBLEM 

This appendix is written and added in filial proof. We feel tliat 
the information contained here will add eonsiderably to the unde rst and i np 
of the power and limitations of Procedure B. 

We have satisfied ourselves that a major source of error is simply 
tlie use of a large partition. We were led to this conclusion by finding 
tliat some large spectral strata which are nearly pure wheat in a majority 
of the sample segments are non-wheat in a certain small subset of the 
segments resulting in large overestimates (Segments 116J, 1165 and 1167). 
Spectral strata which are nearly pure non-wheat for a majority of the 
segments have significant wheat percentages for a different small subset, 
resulting in sizeable underestimates (.Scgmcnt.s 1861 and 1865). These 
examples are shown in Figure 1, which is a portion of a table of true- 
percent wheat by spectral stratum and segment, and in Figure 2, which 
is a reproduction of Figure 10 (Section 5) showing key segment numbers. 

We have run the spectral stratification program, BCLUST, with a 
variety of parameter settings and using a variety of distance measures. 
Emphasis has been on using a large number of small spectral strata, and 
the above statements are verified in each case. Hence we conclude that 
the non-wheat in Segments 1163, 1165 and 1167, for example, cannot be 
.separated from the wheat in the majority of the segment.s on the basis 
of spectral information alone. Some other information, either as input 
to a signature model or as a partitioning variable, must be used. 

The most noticeable fact about the segments identified in Figure 2 
is their position in the State of Kansas. Figure 3 shows these same 
segment.^ and their relationship to others in the State-. The as.soc i ;it ion 
of the two figures is obvious. Hence we sought supporting information 
in other ancillary information besides geographical po.sition. 


wnxwis TVHXDacis 

FIGURE 1. PORTION OF A TABLE OF TRUE PERCENT WHEAT (BETWEEN THE LINES) ANJ 
THE NUMBER OF PIXELS (ON THE LINES) BY SPECTRAL STRATUM AND SEGMENT. (Some 
of the segments are represented by two different combinations of passes.) 


1020(1) 1020(2) 1035 L041 1154 1165(1) 1165(2) 1851 1852 186! 1865(1) 1865(2) 1886 
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FIGURE 2. REPRODUCTION OF FIGURE 10 (SECTION 5) 
SHOWING KEY SEGMENT NUMBERS 
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FIGURE 3. LOCATION IN KANSAS OF THE SEGMENTS STUDIED 
SHOWING KEY SEGMENT N LUMBERS 
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We first considered the "crop calendar", as illustrated by the 
Robertson Triquadratic Model to compute biometeorological time, output 
from CCEA(NOAA) . These data were provided as part of the LACIE Yield 
Estimation Subsystem Weekly Meteorological Sununarxes. An example is 
shown in Eigure 4 which gives the crop calendar adjustment for April 18, 
1976. Although there is a slight east-west trend, the dominant change 
is from north to south. It appears doubtful that the differences in 
signature are explained by this source. In general, the crop calendar 
is defined by the dally maximum and minimum temperatures, and in general 
temperature gradients are maximum in the north-south direction. Further- 
more, the average crop calendar Is partially compensated already due to 
the fact that LACIE acquisition windows are defined by the average crop 
calendar. 

Next we considered the various precipitation Indices available 
from the Weekly Summaries, as well as the general reports of crop pro- 
gress and condition. As is well known in the LACIE community, the 
1975-76 season was a drought year in many parts of the Croat Plains. 

In the State of Kansas we find an easily observable correlation with 
crop moisture indices. Figure 5 is a typical early season crop moisture 
index. The eastern part of KANSAS had received adequate moisture — the 
southwestern corner was "too dry, yield prospects reduced". The western 
half was "abnormally dry, prospects deteriorating". 

This overall pattern was maintained from September 1975 through 
February 1977. Associated lack of snow cover exposed the west and 
southwest portions to winter kill. By March farmers were abandoning 
fields . 

The National Almanac shows the long term average precipitation 
index to have essentially east-west gradients across Kansas. In 1975-76 
this gradient was unusually strong, with emphasis on drought in the 
southwest corner of the state. 
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FIGURE 5. TYPICAL EMILY-SEASON MAP OF THE CROP MOISTURE INDEX 
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Nov. 1.1975 
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One might expeet that the development of the crop would be delayed 
but this was not the case. In fact an early warming trend brought winter 
wheat fields out of dormancy earlier than usual throughout the state , 
providing they had survived the drought . The crop development stayed 
ahead of normal throughout the season. The principal effect of the 
drought upon surviving crops appears to have been to reduce the ground 
cover of the crop. 

We can see in this pattern of events a rationale for the pattern 
of missed wheat detections evident in Figures 2 and 3. The "greenness" 
of the wheat signature is a function primarily of ground cover, given 
that the leaves themselves are green. Grass or pasture contains a great 
deal of dead material, so that the new shoots are In part hidden and the 
potential for green cover is initially small, compared to that for wheat. 

In this case, however, the wheat signature is in fact dominated by 
the large dry central region of the state (even though, out of the six 
training segments, one each is chosen from the east artd the southwest) 
and this signature matches grass/pasture in the eastern part of the 
state. In the southwest the wheat fields are of such low ground cover 
that they look like grass/pasture in the central section. 

The way to test this conjecture is to incorporate crop moisture 
index as an ancillary variable to be used either to partition the data 
set or to make the signatures dependent upon it. At the present time 
we do not have these data available in a suitable form. However the 
overall trend appears to be that the average gradient is lined up with 
the progression of longitudes. Hence we have elected use segment 
longitude as an ancillary variable in a preliminary attempt to improve 
results . 

The method we have chosen is to incorporate longitude along with 
spectral variables into the stratification program, BCLUST. The result 
Is to effect a joint spectral and spatial stratification. We have 
attempted this approach previously using linear combinations of many 
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ancillary variables and have found that great care is needed to keep 
the ancillary variables from dominating the stratification. 

Figure 6 shows the result of applying this procedure. These 
estimates are notably improved over previous ones. The bulk of the 
observations are near the 45^ diagonal (dashed line) and in particular 
the observations which previously were wild are close to the line. An 
overall slightly positive bias Is evident. There is one segment, 1883, 
identified as in Figure 3, which is greatly overestimated. In the 
joint spectral-spatial stratification this segment formed a stratum 
unto itself; but unfortunately the ground truth had not been exhaus- 
tively prepared for this segment so it could not be used for training, 
hence this wild estimate. 

One is tempted to believe that the east-west pattern (actually a 
surrogate for the east-west moisture pattern) explains the Procedure B 
performance and signature differences satisfactorily. However the 
results are compromised by the fact that we used the true proportions 
from all the sites in looking for an explanation. For this reason we 
are not stating the results as a conclusive test. Instead we express 
the explanation for the results in terms of a hypothesis to be tested; 
namely, winter wheat spectral signatures, especially the "greenness" 
component, are more significantly related to ground cover and thus to 
available soil moisture, than they are to departures of temperature 
conditions from the long term average. 

An empirical study of spectral- temporal signature patterns is 
needed to further reinforce the viewpoint developed here. We expect 
to carry out such studies along with other ancillary data studies in 
the future. 
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1883 



FIGURE 6. ESTIMATED VS. TRUE WHEAT PERCENTAGE FOR 17 SEGMENTS 
WHEN LONGITUDE IS INCLUDED IN THE STRATIFICATION 
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